Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-04-03 | Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Keyu Tian et.al. | 2404.02905v1 | link |
2024-04-03 | LidarDM: Generative LiDAR Simulation in a Generated World | Vlas Zyrianov et.al. | 2404.02903v1 | null |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899v1 | null |
2024-04-03 | On the Scalability of Diffusion-based Text-to-Image Generation | Hao Li et.al. | 2404.02883v1 | null |
2024-04-03 | Fast Diffusion Model For Seismic Data Noise Attenuation | Junheng Peng et.al. | 2404.02767v1 | null |
2024-04-03 | Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models | Wentian Zhang et.al. | 2404.02747v1 | link |
2024-04-03 | InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation | Haofan Wang et.al. | 2404.02733v1 | link |
2024-04-03 | Harnessing the Power of Large Vision Language Models for Synthetic Image Detection | Mamadou Keita et.al. | 2404.02726v1 | null |
2024-04-02 | Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models | Zeyu Yang et.al. | 2404.02148v1 | link |
2024-04-02 | WcDT: World-centric Diffusion Transformer for Traffic Scene Generation | Chen Yang et.al. | 2404.02082v1 | link |
2024-04-02 | AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design | Xinze Li et.al. | 2404.02003v1 | null |
2024-04-02 | Bi-LORA: A Vision-Language Approach for Synthetic Image Detection | Mamadou Keita et.al. | 2404.01959v1 | null |
2024-03-29 | Relation Rectification in Diffusion Model | Yinwei Wu et.al. | 2403.20249v1 | null |
2024-03-29 | Graph Neural Aggregation-diffusion with Metastability | Kaiyuan Cui et.al. | 2403.20221v1 | null |
2024-03-29 | Motion Inversion for Video Customization | Luozhou Wang et.al. | 2403.20193v1 | null |
2024-03-29 | FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models | Barbara Toniella Corradini et.al. | 2403.20105v1 | null |
2024-03-29 | SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior | Zhongrui Yu et.al. | 2403.20079v1 | null |
2024-03-29 | Optimal s-boxes against alternative operations | Marco Calderini et.al. | 2403.20059v1 | null |
2024-03-28 | GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling | Bowen Zhang et.al. | 2403.19655v1 | null |
2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu et.al. | 2403.19653v1 | link |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652v1 | null |
2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva et.al. | 2403.19645v1 | null |
2024-03-28 | Generalisation of the Spectral Difference scheme for the diffused-interface five equation model | Niccolò Tonicello et.al. | 2403.19623v1 | null |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600v1 | link |
2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman et.al. | 2403.19593v1 | null |
2024-03-28 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics | Norman Di Palo et.al. | 2403.19578v1 | null |
2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | Daniel Winter et.al. | 2403.18818v1 | null |
2024-03-27 | Garment3DGen: 3D Garment Stylization and Texture Generation | Nikolaos Sarafianos et.al. | 2403.18816v1 | null |
2024-03-28 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807v2 | link |
2024-03-27 | Object Pose Estimation via the Aggregation of Diffusion Features | Tianfu Wang et.al. | 2403.18791v1 | link |
2024-03-27 | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Chenshuang Zhang et.al. | 2403.18775v1 | link |
2024-03-28 | FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing | Trong-Tung Nguyen et.al. | 2403.18605v2 | null |
2024-03-27 | HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions | Hao Xu et.al. | 2403.18575v1 | link |
2024-03-26 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal et.al. | 2403.17936v1 | null |
2024-03-26 | SLEDGE: Synthesizing Simulation Environments for Driving Agents with Generative Models | Kashyap Chitta et.al. | 2403.17933v1 | null |
2024-03-26 | AID: Attention Interpolation of Text-to-Image Diffusion | Qiyuan He et.al. | 2403.17924v1 | link |
2024-03-26 | Boosting Diffusion Models with Moving Average Sampling in Frequency Domain | Yurui Qian et.al. | 2403.17870v1 | null |
2024-03-26 | The memory of Rayleigh-Taylor turbulence | S. Thévenin et.al. | 2403.17832v1 | null |
2024-03-26 | DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions | Sammy Christen et.al. | 2403.17827v1 | null |
2024-03-25 | Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning | Sicong Pan et.al. | 2403.16803v1 | null |
2024-03-25 | Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise | Dilum Fernando et.al. | 2403.16790v1 | null |
2024-03-25 | Multilevel Modeling as a Methodology for the Simulation of Human Mobility | Luca Serena et.al. | 2403.16745v1 | null |
2024-03-25 | A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models | Nils Ingelhag et.al. | 2403.16730v1 | null |
2024-03-25 | Improving Diffusion Models’s Data-Corruption Resistance using Scheduled Pseudo-Huber Loss | Artem Khrapov et.al. | 2403.16728v1 | link |
2024-03-25 | The effect of inter-track coupling on H$_2$O$_2$ productions | Ramin Abolfath et.al. | 2403.16722v1 | null |
2024-03-25 | The Directionality of Gravitational and Thermal Diffusive Transport in Geologic Fluid Storage | Anna Herring et.al. | 2403.16659v1 | null |
2024-03-25 | SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions | Yuda Song et.al. | 2403.16627v1 | link |
2024-03-25 | SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation | Aysim Toker et.al. | 2403.16605v1 | null |
2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye et.al. | 2403.15389v1 | null |
2024-03-22 | LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis | Kevin Xie et.al. | 2403.15385v1 | null |
2024-03-22 | Controlled Training Data Generation with Diffusion Models | Teresa Yeo et.al. | 2403.15309v1 | null |
2024-03-22 | Parametric PDE Control with Deep Reinforcement Learning and Differentiable L0-Sparse Polynomial Policies | Nicolò Botteghi et.al. | 2403.15267v1 | null |
2024-03-22 | Spectral Motion Alignment for Video Motion Transfer using Diffusion Models | Geon Yeong Park et.al. | 2403.15249v1 | null |
2024-03-22 | Shadow Generation for Composite Image Using Diffusion model | Qingyang Liu et.al. | 2403.15234v1 | link |
2024-03-22 | Broad Instantaneous Bandwidth Microwave Spectrum Analyzer with a Microfabricated Atomic Vapor Cell | Yongqi Shi et.al. | 2403.15155v1 | null |
2024-03-22 | Oxygenation of CO and NO on Amorphous Solid Water | Meenu Upadhyay et.al. | 2403.15141v1 | null |
2024-03-21 | Simplified Diffusion Schrödinger Bridge | Zhicong Tang et.al. | 2403.14623v1 | link |
2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Yinghao Xu et.al. | 2403.14621v1 | link |
2024-03-21 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Xiang Fan et.al. | 2403.14617v1 | null |
2024-03-21 | DreamReward: Text-to-3D Generation with Human Preference | Junliang Ye et.al. | 2403.14613v1 | null |
2024-03-21 | ReNoise: Real Image Inversion Through Iterative Noising | Daniel Garibi et.al. | 2403.14602v1 | null |
2024-03-21 | Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors | Nikolaos Tsagkas et.al. | 2403.14526v1 | null |
2024-03-21 | Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation | Mathias Öttl et.al. | 2403.14429v1 | null |
2024-03-20 | On Pretraining Data Diversity for Self-Supervised Learning | Hasan Abed Al Kader Hammoud et.al. | 2403.13808v1 | link |
2024-03-20 | Editing Massive Concepts in Text-to-Image Diffusion Models | Tianwei Xiong et.al. | 2403.13807v1 | link |
2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | Vincent Tao Hu et.al. | 2403.13802v1 | link |
2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | Jingxi Chen et.al. | 2403.13800v1 | null |
2024-03-20 | DepthFM: Fast Monocular Depth Estimation with Flow Matching | Ming Gui et.al. | 2403.13788v1 | null |
2024-03-20 | Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation | Fu-Yun Wang et.al. | 2403.13745v1 | link |
2024-03-20 | Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes | Yifan Chen et.al. | 2403.13724v1 | null |
2024-03-19 | FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Linjiang Huang et.al. | 2403.12963v1 | link |
2024-03-19 | FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation | Shuai Yang et.al. | 2403.12962v1 | link |
2024-03-19 | TexTile: A Differentiable Metric for Texture Tileability | Carlos Rodriguez-Pardo et.al. | 2403.12961v1 | null |
2024-03-19 | GVGEN: Text-to-3D Generation with Volumetric Representation | Xianglong He et.al. | 2403.12957v1 | null |
2024-03-19 | Zero-Reference Low-Light Enhancement via Physical Quadruple Priors | Wenjing Wang et.al. | 2403.12933v1 | null |
2024-03-19 | You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs | Yihong Luo et.al. | 2403.12931v1 | link |
2024-03-19 | Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model | Jiajie Yang et.al. | 2403.12915v1 | link |
2024-03-19 | D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation | Jun Yamada et.al. | 2403.12861v1 | null |
2024-03-18 | Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models | Emilian Postolache et.al. | 2403.11706v1 | link |
2024-03-19 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697v2 | null |
2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | Julia Wolleb et.al. | 2403.11667v1 | null |
2024-03-18 | Diffusion-Based Environment-Aware Trajectory Prediction | Theodor Westny et.al. | 2403.11643v1 | null |
2024-03-18 | Arc2Face: A Foundation Model of Human Faces | Foivos Paraperas Papantoniou et.al. | 2403.11641v1 | link |
2024-03-18 | LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models | Yang Yang et.al. | 2403.11627v1 | link |
2024-03-18 | CRS-Diff: Controllable Generative Remote Sensing Foundation Model | Datao Tang et.al. | 2403.11614v1 | link |
2024-03-15 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Ronghui Li et.al. | 2403.10518v1 | link |
2024-03-15 | MusicHiFi: Fast High-Fidelity Stereo Vocoding | Ge Zhu et.al. | 2403.10493v1 | null |
2024-03-15 | SculptDiff: Learning Robotic Clay Sculpting from Humans with Goal Conditioned Diffusion Policy | Alison Bartsch et.al. | 2403.10401v1 | null |
2024-03-15 | Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding | Pengkun Liu et.al. | 2403.10395v1 | link |
2024-03-15 | Denoising Task Difficulty-based Curriculum for Training Diffusion Models | Jin-Young Kim et.al. | 2403.10348v1 | null |
2024-03-15 | Towards Generalizable Deepfake Video Detection with Thumbnail Layout and Graph Reasoning | Yuting Xu et.al. | 2403.10261v1 | link |
2024-03-14 | SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior | Huan-ang Gao et.al. | 2403.09638v1 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631v1 | null |
2024-03-14 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang et.al. | 2403.09630v1 | link |
2024-03-14 | Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation | Fangfu Liu et.al. | 2403.09625v1 | null |
2024-03-14 | Score-Guided Diffusion for 3D Human Recovery | Anastasis Stathopoulos et.al. | 2403.09623v1 | link |
2024-03-14 | Explore In-Context Segmentation via Latent Diffusion Models | Chaoyang Wang et.al. | 2403.09616v1 | null |
2024-03-14 | The effect of spatially-varying collision frequency on the development of the Rayleigh-Taylor instability | John Rodman et.al. | 2403.09591v1 | null |
2024-03-14 | MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Zunnan Xu et.al. | 2403.09471v1 | null |
2024-03-14 | Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing | Wonjun Kang et.al. | 2403.09468v1 | link |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764v1 | null |
2024-03-14 | GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing | Jing Wu et.al. | 2403.08733v2 | null |
2024-03-13 | Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data | Asad Aali et.al. | 2403.08728v1 | link |
2024-03-13 | Historical Astronomical Diagrams Decomposition in Geometric Primitives | Syrine Kalleli et.al. | 2403.08721v1 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860v1 | link |
2024-03-12 | Quantifying and Mitigating Privacy Risks for Tabular Generative Models | Chaoyi Zhu et.al. | 2403.07842v1 | null |
2024-03-12 | MPCPA: Multi-Center Privacy Computing with Predictions Aggregation based on Denoising Diffusion Probabilistic Model | Guibo Luo et.al. | 2403.07838v1 | null |
2024-03-13 | SemCity: Semantic Scene Generation with Triplane Diffusion | Jumin Lee et.al. | 2403.07773v2 | link |
2024-03-12 | Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | Yuxuan Zhang et.al. | 2403.07764v1 | null |
2024-03-13 | Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion | Dongyang Li et.al. | 2403.07721v2 | link |
2024-03-12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | Yuta Oshima et.al. | 2403.07711v1 | link |
2024-03-12 | Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal | Yijun Yang et.al. | 2403.07684v1 | null |
2024-03-11 | BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion | Xuan Ju et.al. | 2403.06976v1 | link |
2024-03-11 | Bayesian Diffusion Models for 3D Shape Reconstruction | Haiyang Xu et.al. | 2403.06973v1 | null |
2024-03-11 | SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data | Jialu Li et.al. | 2403.06952v1 | null |
2024-03-12 | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | Tianhao Qi et.al. | 2403.06951v2 | link |
2024-03-08 | VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Yabo Zhang et.al. | 2403.05438v1 | link |
2024-03-08 | DiffSF: Diffusion Models for Scene Flow Estimation | Yushan Zhang et.al. | 2403.05327v1 | link |
2024-03-07 | ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes | Hashmat Shadab Malik et.al. | 2403.04701v1 | link |
2024-03-07 | Delving into the Trajectory Long-tail Distribution for Muti-object Tracking | Sijia Chen et.al. | 2403.04700v1 | link |
2024-03-07 | PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Junsong Chen et.al. | 2403.04692v1 | null |
2024-03-07 | Pix2Gif: Motion-Guided Diffusion for GIF Generation | Hitesh Kandala et.al. | 2403.04634v1 | null |
2024-03-06 | 3D Diffusion Policy | Yanjie Ze et.al. | 2403.03954v1 | link |
2024-03-06 | GUIDE: Guidance-based Incremental Learning with Diffusion Models | Bartosz Cywiński et.al. | 2403.03938v1 | link |
2024-03-06 | Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation | Xiao Ma et.al. | 2403.03890v1 | null |
2024-03-06 | Latent Dataset Distillation with Diffusion Models | Brian B. Moser et.al. | 2403.03881v1 | null |
2024-03-06 | Accelerating Convergence of Score-Based Diffusion Models, Provably | Gen Li et.al. | 2403.03852v1 | null |
2024-03-06 | Diffusion on language model embeddings for protein sequence generation | Viacheslav Meshchaninov et.al. | 2403.03726v1 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206v1 | null |
2024-03-05 | MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets | Hossein Aboutalebi et.al. | 2403.03194v1 | null |
2024-03-05 | Behavior Generation with Latent Actions | Seungjae Lee et.al. | 2403.03181v1 | link |
2024-03-05 | Enhanced beam-beam modeling to include longitudinal variation during weak-strong simulation | Derong Xu et.al. | 2403.03137v1 | null |
2024-03-02 | Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models | Neta Shaul et.al. | 2403.01329v1 | null |
2024-03-02 | Anomalous mass dependency in Hydra endoderm cell cluster diffusion | Aline Lütz et.al. | 2403.01294v1 | null |
2024-03-02 | DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction | Junwen Xiong et.al. | 2403.01226v1 | null |
2024-03-02 | TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion | Salaheldin Mohamed et.al. | 2403.01212v1 | null |
2024-02-29 | DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Muyang Li et.al. | 2402.19481v1 | link |
2024-02-29 | Structure Preserving Diffusion Models | Haoye Lu et.al. | 2402.19369v1 | null |
2024-02-29 | A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation | Hanxi Li et.al. | 2402.19330v1 | link |
2024-02-29 | DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly | Gianluca Scarpellini et.al. | 2402.19302v1 | link |
2024-02-29 | Generative models struggle with kirigami metamaterials | Gerrit Felsch et.al. | 2402.19196v1 | null |
2024-02-28 | Diffusion Language Models Are Versatile Protein Learners | Xinyou Wang et.al. | 2402.18567v1 | null |
2024-02-28 | Photon statistics of resonantly driven spectrally diffusive quantum emitters | Aymeric Delteil et.al. | 2402.18542v1 | null |
2024-02-28 | Dynamical Regimes of Diffusion Models | Giulio Biroli et.al. | 2402.18491v1 | null |
2024-02-28 | Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model | Sangjoon Park et.al. | 2402.18362v1 | null |
2024-02-27 | Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning | Xiaoyu Zhang et.al. | 2402.17768v1 | null |
2024-02-27 | Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners | Yazhou Xing et.al. | 2402.17723v1 | null |
2024-02-27 | Structure-Guided Adversarial Training of Diffusion Models | Ling Yang et.al. | 2402.17563v1 | null |
2024-02-27 | Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label | Xinliang Zhang et.al. | 2402.17555v1 | link |
2024-02-27 | Diffusion Model-Based Image Editing: A Survey | Yi Huang et.al. | 2402.17525v1 | link |
2024-02-27 | Label-Noise Robust Diffusion Models | Byeonghu Na et.al. | 2402.17517v1 | link |
2024-02-27 | The Unwanted Dissemination of Science: The Usage of Academic Articles as Ammunition in Contested Discursive Arenas on Twitter | Richard Zhang et.al. | 2402.17495v1 | null |
2024-02-27 | EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Linrui Tian et.al. | 2402.17485v1 | null |
2024-02-26 | Stochastic Conditional Diffusion Models for Semantic Image Synthesis | Juyeon Ko et.al. | 2402.16506v1 | null |
2024-02-26 | Outline-Guided Object Inpainting with Diffusion Models | Markus Pobitzer et.al. | 2402.16421v1 | null |
2024-02-26 | Placing Objects in Context via Inpainting for Out-of-distribution Segmentation | Pau de Jorge et.al. | 2402.16392v1 | link |
2024-02-26 | Generative AI in Vision: A Survey on Models, Metrics and Applications | Gaurav Raut et.al. | 2402.16369v1 | null |
2024-02-26 | Feedback Efficient Online Fine-Tuning of Diffusion Models | Masatoshi Uehara et.al. | 2402.16359v1 | null |
2024-02-26 | Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion | Xuantong Liu et.al. | 2402.16305v1 | null |
2024-02-26 | Graph Diffusion Policy Optimization | Yijing Liu et.al. | 2402.16302v1 | link |
2024-02-23 | Seamless Human Motion Composition with Blended Positional Encodings | German Barquero et.al. | 2402.15509v1 | link |
2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh et.al. | 2402.15504v1 | link |
2024-02-23 | Solute transport due to periodic loading in a soft porous material | Matilde Fiori et.al. | 2402.15451v1 | null |
2024-02-23 | ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation | Yi Zhang et.al. | 2402.15429v1 | link |
2024-02-23 | Understanding Oversmoothing in Diffusion-Based GNNs From the Perspective of Operator Semigroup Theory | Weichen Zhao et.al. | 2402.15326v1 | null |
2024-02-23 | Let’s Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models | Shunyu Liu et.al. | 2402.15289v1 | link |
2024-02-22 | Cameras as Rays: Pose Estimation via Ray Diffusion | Jason Y. Zhang et.al. | 2402.14817v1 | null |
2024-02-22 | GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion | Xueyi Liu et.al. | 2402.14810v1 | link |
2024-02-22 | Consolidating Attention Features for Multi-view Image Editing | Or Patashnik et.al. | 2402.14792v1 | null |
2024-02-22 | Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models | Yixuan Ren et.al. | 2402.14780v1 | null |
2024-02-22 | Two-stage Cytopathological Image Synthesis for Augmenting Cervical Abnormality Screening | Zhenrong Shen et.al. | 2402.14707v1 | null |
2024-02-22 | Debiasing Text-to-Image Diffusion Models | Ruifei He et.al. | 2402.14577v1 | null |
2024-02-22 | DynGMA: a robust approach for learning stochastic differential equations from data | Aiqing Zhu et.al. | 2402.14475v1 | link |
2024-02-21 | D-Flow: Differentiating through Flows for Controlled Generation | Heli Ben-Hamu et.al. | 2402.14017v1 | null |
2024-02-21 | SDXL-Lightning: Progressive Adversarial Diffusion Distillation | Shanchuan Lin et.al. | 2402.13929v1 | null |
2024-02-21 | Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate | Yuchen Liang et.al. | 2402.13901v1 | null |
2024-02-21 | NeuralDiffuser: Controllable fMRI Reconstruction with Primary Visual Feature Guided Diffusion | Haoyu Li et.al. | 2402.13809v1 | null |
2024-02-21 | The Geography of Information Diffusion in Online Discourse on Europe and Migration | Elisa Leonardelli et.al. | 2402.13800v1 | null |
2024-02-21 | Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Jiayu Chen et.al. | 2402.13777v1 | link |
2024-02-21 | Music Style Transfer with Time-Varying Inversion of Diffusion Models | Sifei Li et.al. | 2402.13763v1 | null |
2024-02-20 | Neural Network Diffusion | Kai Wang et.al. | 2402.13144v1 | link |
2024-02-20 | Excited state-specific CASSCF theory for the torsion of ethylene | Sandra Saade et.al. | 2402.13046v1 | null |
2024-02-20 | Text-Guided Molecule Generation with Diffusion Language Model | Haisong Gong et.al. | 2402.13040v1 | link |
2024-02-20 | Visual Style Prompting with Swapping Self-Attention | Jaeseok Jeong et.al. | 2402.12974v1 | link |
2024-02-20 | CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection | Sohail Ahmed Khan et.al. | 2402.12927v1 | null |
2024-02-20 | RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models | Xinchen Zhang et.al. | 2402.12908v1 | link |
2024-02-19 | FiT: Flexible Vision Transformer for Diffusion Model | Zeyu Lu et.al. | 2402.12376v1 | link |
2024-02-19 | Analysis of Persian News Agencies on Instagram, A Words Co-occurrence Graph-based Approach | Mohammad Heydari et.al. | 2402.12272v1 | null |
2024-02-19 | Synthetic location trajectory generation using categorical diffusion models | Simon Dirmeier et.al. | 2402.12242v1 | link |
2024-02-19 | Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations | Jonas Beck et.al. | 2402.12231v1 | link |
2024-02-19 | Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training | Leo Hyun Park et.al. | 2402.12187v1 | null |
2024-02-19 | Human Video Translation via Query Warping | Haiming Zhu et.al. | 2402.12099v1 | null |
2024-02-16 | Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning | Chia-Ling Tsai et.al. | 2402.10894v1 | null |
2024-02-16 | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Tsung-Wei Ke et.al. | 2402.10885v1 | null |
2024-02-16 | Control Color: Multimodal Diffusion-based Interactive Image Colorization | Zhexin Liang et.al. | 2402.10855v1 | null |
2024-02-16 | Training Class-Imbalanced Diffusion Model Via Overlap Optimization | Divin Yan et.al. | 2402.10821v1 | link |
2024-02-16 | VATr++: Choose Your Words Wisely for Handwritten Text Generation | Bram Vanherle et.al. | 2402.10798v1 | null |
2024-02-16 | Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation | Hongbin Na et.al. | 2402.10699v1 | null |
2024-02-16 | Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm | Yuanzhen Xie et.al. | 2402.10671v1 | link |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210v1 | null |
2024-02-15 | Recovering the Pre-Fine-Tuning Weights of Generative Models | Eliahu Horwitz et.al. | 2402.10208v1 | link |
2024-02-15 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207v1 | link |
2024-02-15 | Energy Flux Decomposition in Magnetohydrodynamic Turbulence | D. Capocci et.al. | 2402.10125v1 | null |
2024-02-15 | Collision efficiency of droplets across diffusive, electrostatic and inertial regimes | Florian Poydenot et.al. | 2402.10117v1 | null |
2024-02-15 | Quantized Embedding Vectors for Controllable Diffusion Language Models | Cheng Kang et.al. | 2402.10107v1 | null |
2024-02-15 | Classification Diffusion Models | Shahar Yadin et.al. | 2402.10095v1 | null |
2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Ze Ma et.al. | 2402.09368v1 | link |
2024-02-14 | Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio | Pablo Alonso-Jiménez et.al. | 2402.09318v1 | null |
2024-02-14 | Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Pengfei Zhou et.al. | 2402.09242v1 | link |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682v1 | null |
2024-02-13 | Target Score Matching | Valentin De Bortoli et.al. | 2402.08667v1 | null |
2024-02-13 | Learning Continuous 3D Words for Text-to-Image Generation | Ta-Ying Cheng et.al. | 2402.08654v1 | null |
2024-02-13 | Latent Inversion with Timestep-aware Sampling for Training-free Non-rigid Editing | Yunji Jung et.al. | 2402.08601v1 | null |
2024-02-13 | Denoising Diffusion Restoration Tackles Forward and Inverse Problems for the Laplace Operator | Amartya Mukherjee et.al. | 2402.08563v1 | null |
2024-02-13 | Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases | Ziyi Zhang et.al. | 2402.08552v1 | null |
2024-02-13 | Hyperballistic transport in dense ionized matter under external AC electric fields | Daniele Gamba et.al. | 2402.08519v1 | null |
2024-02-12 | Label-Efficient Model Selection for Text Generation | Shir Ashury-Tahan et.al. | 2402.07891v1 | null |
2024-02-12 | High-order harmonic generation in 2D Transition Metal Disulphides | Jose Manuel Iglesias et.al. | 2402.07850v1 | null |
2024-02-12 | Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Jiacheng Ye et.al. | 2402.07754v1 | link |
2024-02-12 | Topological Edge States in Reconfigurable Multi-stable Mechanical Metamaterials | Zhen Wang et.al. | 2402.07707v1 | null |
2024-02-12 | Higher-order Connection Laplacians for Directed Simplicial Complexes | Xue Gong et.al. | 2402.07631v1 | null |
2024-02-09 | Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Brian Yang et.al. | 2402.06559v1 | null |
2024-02-09 | Sequential Flow Matching for Generative Modeling | Jongmin Yoon et.al. | 2402.06461v1 | null |
2024-02-09 | ControlUDA: Controllable Diffusion-assisted Unsupervised Domain Adaptation for Cross-Weather Semantic Segmentation | Fengyi Shen et.al. | 2402.06446v1 | null |
2024-02-09 | Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation | Peter Hönig et.al. | 2402.06436v1 | null |
2024-02-09 | Enhanced bubble growth near an advancing solidification front | Jochem G. Meijer et.al. | 2402.06409v1 | null |
2024-02-08 | InstaGen: Enhancing Object Detection by Training on Synthetic Dataset | Chengjian Feng et.al. | 2402.05937v1 | null |
2024-02-08 | Time Series Diffusion in the Frequency Domain | Jonathan Crabbé et.al. | 2402.05933v1 | link |
2024-02-08 | AvatarMMC: 3D Head Avatar Generation and Editing with Multi-Modal Conditioning | Wamiq Reyaz Para et.al. | 2402.05803v1 | null |
2024-02-08 | Determining the significance and relative importance of parameters of a simulated quenching algorithm using statistical tools | Pedro A. Castillo et.al. | 2402.05791v1 | null |
2024-02-08 | DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer | Zhiyuan Ma et.al. | 2402.05712v1 | link |
2024-02-08 | Scalable Diffusion Models with State Space Backbone | Zhengcong Fei et.al. | 2402.05608v1 | link |
2024-02-07 | On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling | Marcin Sendera et.al. | 2402.05098v1 | link |
2024-02-07 | NITO: Neural Implicit Fields for Resolution-free Topology Optimization | Amin Heyrani Nobari et.al. | 2402.05073v1 | null |
2024-02-07 | LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation | Jiaxiang Tang et.al. | 2402.05054v1 | null |
2024-02-06 | SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models | Yichen Shi et.al. | 2402.04178v1 | link |
2024-02-06 | Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Ruoqi Zhang et.al. | 2402.04080v1 | link |
2024-02-06 | Generative Modeling of Graphs via Joint Diffusion of Node and Edge Attributes | Nimrod Berman et.al. | 2402.04046v1 | null |
2024-02-06 | Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation | Zolnamar Dorjsembe et.al. | 2402.04031v1 | link |
2024-02-06 | Space Group Constrained Crystal Generation | Rui Jiao et.al. | 2402.03992v1 | null |
2024-02-06 | Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting | Yiming Xu et.al. | 2402.03981v1 | null |
2024-02-06 | Weibel- and non-resonant Whistler wave growth in an expanding plasma in a 1D simulation geometry | M E Dieckmann et.al. | 2402.03925v1 | null |
2024-02-05 | Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | Qiyao Liang et.al. | 2402.03305v1 | null |
2024-02-05 | Zero-shot Object-Level OOD Detection with Context-Aware Inpainting | Quang-Huy Nguyen et.al. | 2402.03292v1 | null |
2024-02-05 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang et.al. | 2402.03290v1 | link |
2024-02-05 | Organic or Diffused: Can We Distinguish Human Art from AI-generated Images? | Anna Yoo Jeong Ha et.al. | 2402.03214v1 | null |
2024-02-05 | Light and Optimal Schrödinger Bridge Matching | Nikita Gushchin et.al. | 2402.03207v1 | link |
2024-02-05 | Guidance with Spherical Gaussian Constraint for Conditional Diffusion | Lingxiao Yang et.al. | 2402.03201v1 | null |
2024-02-05 | Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion | Shiyuan Yang et.al. | 2402.03162v1 | null |
2024-02-05 | DARTS: Diffusion Approximated Residual Time Sampling for Low Variance Time-of-flight Rendering in Homogeneous Scattering Medium | Qianyue He et.al. | 2402.03106v1 | null |
2024-02-02 | NeuroCine: Decoding Vivid Video Sequences from Human Brain Activities | Jingyuan Sun et.al. | 2402.01590v1 | null |
2024-02-02 | Boximator: Generating Rich and Controllable Motions for Video Synthesis | Jiawei Wang et.al. | 2402.01566v1 | null |
2024-02-02 | Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations | Panos Kakoulidis et.al. | 2402.01520v1 | null |
2024-02-02 | Cross-view Masked Diffusion Transformers for Person Image Synthesis | Trung X. Pham et.al. | 2402.01516v1 | null |
2024-02-01 | AToM: Amortized Text-to-Mesh using 2D Diffusion | Guocheng Qian et.al. | 2402.00867v1 | null |
2024-02-01 | ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields | Jiahua Dong et.al. | 2402.00864v1 | link |
2024-02-01 | Distilling Conditional Diffusion Models for Offline Reinforcement Learning through Trajectory Stitching | Shangzhe Li et.al. | 2402.00807v1 | null |
2024-02-01 | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning | Fu-Yun Wang et.al. | 2402.00769v1 | link |
2024-02-01 | CapHuman: Capture Your Moments in Parallel Universes | Chao Liang et.al. | 2402.00627v1 | link |
2024-02-01 | Diffusion-based Light Field Synthesis | Ruisheng Gao et.al. | 2402.00575v1 | null |
2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | Daniel Geng et.al. | 2401.18085v1 | null |
2024-01-31 | An electrodynamic wave model for the action potential | Vitaly L. Galinsky et.al. | 2401.18051v1 | null |
2024-01-31 | Investigation of Microstructure and Corrosion Resistance of Ti-Al-V Titanium Alloys Obtained by Spark Plasma Sintering | Aleksey Nokhrin et.al. | 2401.17941v1 | null |
2024-01-31 | AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error | Jonas Ricker et.al. | 2401.17879v1 | link |
2024-01-30 | You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation | Mehdi Noroozi et.al. | 2401.17258v1 | null |
2024-01-30 | ContactGen: Contact-Guided Interactive 3D Human Generation for Partners | Dongjun Gu et.al. | 2401.17212v1 | null |
2024-01-30 | Transfer Learning for Text Diffusion Models | Kehang Han et.al. | 2401.17181v1 | null |
2024-01-29 | Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models | Zhongjie Duan et.al. | 2401.16224v1 | null |
2024-01-29 | Rapidly rotating radiatively driven convection: experimental and numerical validation of the 'geostrophic turbulence' scaling predictions | Gabriel Hadjerci et.al. | 2401.16200v1 | null |
2024-01-29 | Spatial-Aware Latent Initialization for Controllable Image Generation | Wenqiang Sun et.al. | 2401.16157v1 | null |
2024-01-29 | Acoustic Screens based on Sonic Crystals with high Diffusion properties | M. P. Peiró-Torres et.al. | 2401.16074v1 | null |
2024-01-26 | Annotated Hands for Generative Models | Yue Yang et.al. | 2401.15075v1 | link |
2024-01-26 | Emulating Complex Synapses Using Interlinked Proton Conductors | Lifu Zhang et.al. | 2401.15045v1 | null |
2024-01-26 | DAM: Diffusion Activation Maximization for 3D Global Explanations | Hanxiao Tan et.al. | 2401.14938v1 | link |
2024-01-26 | Social norms and cooperation in higher-order networks | Yin-Jie Ma et.al. | 2401.14905v1 | null |
2024-01-25 | Deconstructing Denoising Diffusion Models for Self-Supervised Learning | Xinlei Chen et.al. | 2401.14404v1 | null |
2024-01-25 | pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Ege Ozguroglu et.al. | 2401.14398v1 | link |
2024-01-25 | Manifold GCN: Diffusion-based Convolutional Neural Network for Manifold-valued Graphs | Martin Hanik et.al. | 2401.14381v1 | null |
2024-01-25 | UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models | Timo Kapsalis et.al. | 2401.14379v1 | null |
2024-01-25 | Modeling Global Surface Dust Deposition Using Physics-Informed Neural Networks | Constanza A. Molina Catricheo et.al. | 2401.14372v1 | link |
2024-01-25 | Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation | Minglin Chen et.al. | 2401.14257v1 | null |
2024-01-24 | Bi-Hamiltonian in Semiflexible Polymer as Strongly Coupled System | Heeyuen Koh et.al. | 2401.13655v1 | null |
2024-01-24 | On the self-similarity of unbounded viscous Marangoni flows | Fernando Temprano-Coleto et.al. | 2401.13647v1 | null |
2024-01-24 | Winding Clearness for Differentiable Point Cloud Optimization | Dong Xiao et.al. | 2401.13639v1 | null |
2024-01-24 | Guided Diffusion for Fast Inverse Design of Density-based Mechanical Metamaterials | Yanyan Yang et.al. | 2401.13570v1 | null |
2024-01-24 | Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting | Hounsu Kim et.al. | 2401.13498v1 | null |
2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | Taeksoo Kim et.al. | 2401.12979v1 | null |
2024-01-23 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | Hyeonwoo Kim et.al. | 2401.12978v1 | null |
2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Omer Bar-Tal et.al. | 2401.12945v1 | null |
2024-01-23 | Long-range three-dimensional tracking of nanoparticles using interferometric scattering (iSCAT) microscopy | Kiarash Kasaian et.al. | 2401.12939v1 | null |
2024-01-22 | DITTO: Diffusion Inference-Time T-Optimization for Music Generation | Zachary Novack et.al. | 2401.12179v1 | null |
2024-01-22 | Single-View 3D Human Digitalization with Large Reconstruction Models | Zhenzhen Weng et.al. | 2401.12175v1 | null |
2024-01-22 | Improved accuracy of continuum surface flux models for metal additive manufacturing melt pool simulations | Nils Much et.al. | 2401.12114v1 | null |
2024-01-22 | Experimental investigation and scale analysis on melting of salty ice in a 3D-printed cavity filled with porous media | Xiaotian Li and Yuming Wang et.al. | 2401.12009v1 | null |
2024-01-22 | Claim Detection for Automated Fact-checking: A Survey on Monolingual, Multilingual and Cross-Lingual Research | Rrubaa Panchendrarajan et.al. | 2401.11969v1 | null |
2024-01-22 | Feature Denoising Diffusion Model for Blind Image Quality Assessment | Xudong Li et.al. | 2401.11949v1 | null |
2024-01-19 | Synthesizing Moving People with 3D Control | Boyi Li et.al. | 2401.10889v1 | null |
2024-01-19 | ActAnywhere: Subject-Aware Video Background Generation | Boxiao Pan et.al. | 2401.10822v1 | null |
2024-01-19 | Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Zuoyue Li et.al. | 2401.10786v1 | null |
2024-01-19 | Signatures of s-wave scattering in bound electronic states | Robin E. Moorby et.al. | 2401.10714v1 | null |
2024-01-19 | Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model | Yinan Zheng et.al. | 2401.10700v1 | link |
2024-01-19 | Refractive index measurement of pharmaceutical powders in the short-wave infrared range using index matching assisted with phase imaging | Cory Juntunen et.al. | 2401.10667v1 | null |
2024-01-19 | Analysis of the Patent of a Protective Cover for Vertical-Axis Wind Turbines (VAWTs): Simulations of Wind Flow | JA Moleón Baca et.al. | 2401.10656v1 | null |
2024-01-18 | A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Wouter Van Gansbeke et.al. | 2401.10227v1 | link |
2024-01-18 | Towards Language-Driven Video Inpainting via Multimodal Large Language Models | Jianzong Wu et.al. | 2401.10226v1 | null |
2024-01-18 | Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation | Changgu Chen et.al. | 2401.10150v1 | null |
2024-01-18 | DiffusionGPT: LLM-Driven Text-to-Image Generation System | Jie Qin et.al. | 2401.10061v1 | null |
2024-01-18 | CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects | Zhao Wang et.al. | 2401.09962v1 | null |
2024-01-17 | TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion | Yu-Ying Yeh et.al. | 2401.09416v1 | null |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414v1 | link |
2024-01-17 | Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery | Jia Jia et.al. | 2401.09325v1 | null |
2024-01-17 | Tailoring chaotic motion of microcavity photons in ray and wave dynamics by tuning the curvature of space | Wei Lin et.al. | 2401.09303v1 | null |
2024-01-17 | T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis | Yoonjin Chung et.al. | 2401.09294v1 | null |
2024-01-16 | Robotic Imitation of Human Actions | Josua Spisak et.al. | 2401.08381v1 | null |
2024-01-16 | Optimization of the plasmonic properties of titanium nitride films sputtered at room temperature through microstructure and thickness control | Mateusz Nieborek et.al. | 2401.08353v1 | null |
2024-01-16 | Modeling Spoof Noise by De-spoofing Diffusion and its Application in Face Anti-spoofing | Bin Zhang et.al. | 2401.08275v1 | null |
2024-01-16 | Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization | Chongzhi Zhang et.al. | 2401.08232v1 | null |
2024-01-12 | Decoupling Pixel Flipping and Occlusion Strategy for Consistent XAI Benchmarks | Stefan Blücher et.al. | 2401.06654v1 | link |
2024-01-12 | Adversarial Examples are Misaligned in Diffusion Model Manifolds | Peter Lorenz et.al. | 2401.06637v1 | null |
2024-01-12 | Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking | Wei Cao et.al. | 2401.06614v1 | null |
2024-01-12 | 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model | Qian Wang et.al. | 2401.06578v1 | null |
2024-01-11 | E $^{2}$ GAN: Efficient Training of Efficient GANs for Image-to-Image Translation | Yifan Gong et.al. | 2401.06127v1 | null |
2024-01-11 | Numerical thermalization in 2D PIC simulations: Practical estimates for low temperature plasma simulations | Sierra Jubin et.al. | 2401.06057v1 | null |
2024-01-11 | DiffDA: a diffusion model for weather-scale data assimilation | Langwen Huang et.al. | 2401.05932v1 | null |
2024-01-11 | Efficient Image Deblurring Networks based on Diffusion Models | Kang Chen et.al. | 2401.05907v1 | link |
2024-01-10 | InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes | Mohamad Shahbazi et.al. | 2401.05335v1 | null |
2024-01-10 | Score Distillation Sampling with Learned Manifold Corrective | Thiemo Alldieck et.al. | 2401.05293v1 | null |
2024-01-10 | PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | Junsong Chen et.al. | 2401.05252v1 | link |
2024-01-10 | Derm-T2IM: Harnessing Synthetic Skin Lesion Data via Stable Diffusion Models for Enhanced Skin Disease Classification using ViT and CNN | Muhammad Ali Farooq et.al. | 2401.05159v1 | null |
2024-01-10 | CrossDiff: Exploring Self-Supervised Representation of Pansharpening via Cross-Predictive Diffusion Model | Yinghui Xing et.al. | 2401.05153v1 | null |
2024-01-09 | Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation | Xiyi Chen et.al. | 2401.04728v1 | null |
2024-01-09 | EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models | Jingyuan Yang et.al. | 2401.04608v1 | null |
2024-01-09 | Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models | Xuewen Liu et.al. | 2401.04585v1 | link |
2024-01-09 | MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation | Weimin Wang et.al. | 2401.04468v1 | null |
2024-01-09 | D3AD: Dynamic Denoising Diffusion Probabilistic Model for Anomaly Detection | Justin Tebbe et.al. | 2401.04463v1 | link |
2024-01-08 | D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement | Danqi Yan et.al. | 2401.03914v1 | null |
2024-01-05 | Uncovering the human motion pattern: Pattern Memory-based Diffusion Model for Trajectory Prediction | Yuxin Yang et.al. | 2401.02916v1 | null |
2024-01-05 | Plug-in Diffusion Model for Sequential Recommendation | Haokai Ma et.al. | 2401.02913v1 | link |
2024-01-05 | Generating Non-Stationary Textures using Self-Rectification | Yang Zhou et.al. | 2401.02847v1 | link |
2024-01-05 | Diffbody: Diffusion-based Pose and Shape Editing of Human Images | Yuta Okuyama et.al. | 2401.02804v1 | link |
2024-01-05 | Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors | Top Piriyakulkij et.al. | 2401.02739v1 | null |
2024-01-05 | Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation | Can Xu et.al. | 2401.02683v1 | link |
2024-01-04 | Bring Metric Functions into Diffusion Models | Jie An et.al. | 2401.02414v1 | null |
2024-01-04 | Image denoising and model-independent parameterization for improving IVIM MRI | Caleb Sample et.al. | 2401.02394v1 | null |
2024-01-04 | Integration of physics-informed operator learning and finite element method for parametric learning of partial differential equations | Shahed Rezaei et.al. | 2401.02363v1 | null |
2024-01-04 | Robust Physics Informed Neural Networks | Marcin Łoś et.al. | 2401.02300v1 | null |
2024-01-03 | From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Evonne Ng et.al. | 2401.01885v1 | link |
2024-01-03 | DGDNN: Decoupled Graph Diffusion Neural Network for Stock Movement Prediction | Zinuo You et.al. | 2401.01846v1 | link |
2024-01-03 | Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | David Junhao Zhang et.al. | 2401.01827v1 | link |
2024-01-03 | aMUSEd: An Open MUSE Reproduction | Suraj Patil et.al. | 2401.01808v1 | link |
2024-01-03 | Short-time expansion of one-dimensional Fokker-Planck equations with heterogeneous diffusion | Tom Dupont et.al. | 2401.01765v1 | null |
2024-01-02 | Influence of scanning plane on Human Spinal Cord functional Magnetic Resonance echo planar imaging | Marta Moraschi et.al. | 2401.01281v1 | null |
2024-01-02 | Fairness Certification for Natural Language Processing and Large Language Models | Vincent Freiberger et.al. | 2401.01262v1 | null |
2024-01-02 | VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM | Fuchen Long et.al. | 2401.01256v1 | null |
2024-01-02 | Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation | Renshuai Liu et.al. | 2401.01207v1 | null |
2024-01-02 | Learning Surface Scattering Parameters From SAR Images Using Differentiable Ray Tracing | Jiangtao Wei et.al. | 2401.01175v1 | null |
2024-01-02 | Joint Generative Modeling of Scene Graphs and Images via Diffusion Models | Bicheng Xu et.al. | 2401.01130v1 | null |
2023-12-29 | FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis | Feng Liang et.al. | 2312.17681v1 | null |
2023-12-29 | Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models | Kay Liu et.al. | 2312.17679v1 | link |
2023-12-29 | Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation | Tuan-Anh Vu et.al. | 2312.17505v1 | null |
2023-12-28 | iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views | Chin-Hsuan Wu et.al. | 2312.17250v1 | link |
2023-12-28 | Amodal Ground Truth and Completion in the Wild | Guanqi Zhan et.al. | 2312.17247v1 | link |
2023-12-28 | Personalized Restoration via Dual-Pivot Tuning | Pradyumna Chari et.al. | 2312.17234v1 | null |
2023-12-28 | 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency | Yuyang Yin et.al. | 2312.17225v1 | null |
2023-12-28 | EFHQ: Multi-purpose ExtremePose-Face-HQ dataset | Trung Tuan Dao et.al. | 2312.17205v1 | null |
2023-12-28 | Restoration by Generation with Constrained Priors | Zheng Ding et.al. | 2312.17161v1 | null |
2023-12-28 | InsActor: Instruction-driven Physics-based Characters | Jiawei Ren et.al. | 2312.17135v1 | null |
2023-12-28 | 100-fold improvement in relaxed eddy accumulation flux estimates through error diffusion | Anas Emad et.al. | 2312.17027v1 | link |
2023-12-26 | One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications | Mengyao Lyu et.al. | 2312.16145v1 | null |
2023-12-26 | HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D | Sangmin Woo et.al. | 2312.15980v1 | link |
2023-12-26 | Semantic Guidance Tuning for Text-To-Image Diffusion Models | Hyun Kang et.al. | 2312.15964v1 | null |
2023-12-26 | EnchantDance: Unveiling the Potential of Music-Driven Dance Movement | Bo Han et.al. | 2312.15946v1 | link |
2023-12-22 | MACS: Mass Conditioned 3D Hand and Object Motion Synthesis | Soshi Shimada et.al. | 2312.14929v1 | null |
2023-12-22 | BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction | Honghao Fu et.al. | 2312.14871v1 | null |
2023-12-22 | Dreaming of Electrical Waves: Generative Modeling of Cardiac Excitation Waves using Diffusion Models | Tanish Baranwal et.al. | 2312.14830v1 | null |
2023-12-22 | Neural network models for preferential concentration of particles in two-dimensional turbulence | Thibault Maurel-Oujia et.al. | 2312.14829v1 | null |
2023-12-22 | Plan, Posture and Go: Towards Open-World Text-to-Motion Generation | Jinpeng Liu et.al. | 2312.14828v1 | null |
2023-12-22 | Disorder-induced non-linear growth of viscously-unstable immiscible two-phase flow fingers in porous media | Santanu Sinha et.al. | 2312.14799v1 | null |
2023-12-22 | Diffusion Maps for Signal Filtering in Graph Learning | Todd Hildebrant et.al. | 2312.14758v1 | null |
2023-12-21 | Diffusion Reward: Learning Rewards via Conditional Video Diffusion | Tao Huang et.al. | 2312.14134v1 | null |
2023-12-21 | Neural Point Cloud Diffusion for Disentangled 3D Shape and Appearance Generation | Philipp Schröppel et.al. | 2312.14124v1 | link |
2023-12-21 | HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models | Hayk Manukyan et.al. | 2312.14091v1 | link |
2023-12-21 | Designing Artificial Intelligence Equipped Social Decentralized Autonomous Organizations for Tackling Sextortion Cases Version 0.7 | Norta Alex et.al. | 2312.14090v1 | null |
2023-12-21 | The influence of controlled vibration effects on fluid flow | Alexey Fedyushkin et.al. | 2312.14079v1 | null |
2023-12-21 | Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning | Desai Xie et.al. | 2312.13980v1 | null |
2023-12-21 | Controllable 3D Face Generation with Conditional Style Code Diffusion | Xiaolong Shen et.al. | 2312.13941v1 | link |
2023-12-20 | Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting | Junwu Zhang et.al. | 2312.13271v1 | link |
2023-12-20 | Conditional Image Generation with Pretrained Generative Model | Rajesh Shrestha et.al. | 2312.13253v1 | null |
2023-12-20 | Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model | Saurabh Saxena et.al. | 2312.13252v1 | null |
2023-12-20 | Diffusion Models With Learned Adaptive Noise | Subham Sekhar Sahoo et.al. | 2312.13236v1 | link |
2023-12-20 | MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading | Abdallah Dib et.al. | 2312.13091v1 | null |
2023-12-20 | DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis | Yuming Gu et.al. | 2312.13016v1 | link |
2023-12-20 | A comparative study of analytical models of diffuse reflectance in homogeneous biological tissues: Gelatin based phantoms and Monte Carlo experiments | Anisha Bahl et.al. | 2312.12935v1 | null |
2023-12-19 | On Inference Stability for Diffusion Models | Viet Nguyen et.al. | 2312.12431v1 | link |
2023-12-19 | SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process | Mengyu Wang et.al. | 2312.12425v1 | link |
2023-12-19 | Scene-Conditional 3D Object Stylization and Composition | Jinghao Zhou et.al. | 2312.12419v1 | null |
2023-12-19 | LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset | Haolin Liu et.al. | 2312.12418v1 | null |
2023-12-19 | Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models | Shweta Mahajan et.al. | 2312.12416v1 | null |
2023-12-19 | Intrinsic Image Diffusion for Single-view Material Estimation | Peter Kocsis et.al. | 2312.12274v1 | link |
2023-12-19 | Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model | Lingjun Zhang et.al. | 2312.12232v1 | link |
2023-12-18 | A novel diffusion recommendation algorithm based on multi-scale CNN and residual LSTM | Yong Niu et.al. | 2312.10885v1 | null |
2023-12-17 | Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models | Nikita Starodubcev et.al. | 2312.10835v1 | link |
2023-12-17 | From mixing to displacement of miscible phases in porous media: The role of heterogeneity and inlet pressure | Yahel Eliyahu-Yakir et.al. | 2312.10722v1 | null |
2023-12-17 | CogCartoon: Towards Practical Story Visualization | Zhongyang Zhu et.al. | 2312.10718v1 | null |
2023-12-17 | A Framework of Full-Process Generation Design for Park Green Spaces Based on Remote Sensing Segmentation-GAN-Diffusion | Ran Chen et.al. | 2312.10674v1 | null |
2023-12-15 | Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects | Paul Maria Scheikl et.al. | 2312.10008v1 | null |
2023-12-15 | Contributions to the geomagnetic secular variation from a reanalysis of core surface dynamics | Olivier Barrois et.al. | 2312.09942v1 | null |
2023-12-15 | Assimilation of ground and satellite magnetic measurements: inference of core surface magnetic and velocity field changes | Olivier Barrois et.al. | 2312.09878v1 | null |
2023-12-15 | Integrating New Technologies into Science: The case of AI | Stefano Bianchini et.al. | 2312.09843v1 | null |
2023-12-15 | Socio-Economic Deprivation Analysis: Diffusion Maps | June Moh Goo et.al. | 2312.09830v1 | null |
2023-12-15 | Comparison of Quasi-Geostrophic, Hybrid and 3D models of planetary core convection | Olivier Barrois et.al. | 2312.09826v1 | null |
2023-12-15 | Neural networks for turbulent transport prediction in a simplified model of tokamak plasmas | L. M. Pomârjanschi et.al. | 2312.09807v1 | null |
2023-12-14 | LIME: Localized Image Editing via Attention Regularization in Diffusion Models | Enis Simsar et.al. | 2312.09256v1 | null |
2023-12-14 | FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection | Hongsuk Choi et.al. | 2312.09252v1 | null |
2023-12-14 | Single Mesh Diffusion Models with Field Latents for Texture Generation | Thomas W. Mitchel et.al. | 2312.09250v1 | null |
2023-12-14 | Text2Immersion: Generative Immersive Scene with 3D Gaussians | Hao Ouyang et.al. | 2312.09242v1 | null |
2023-12-14 | A framework for conditional diffusion modelling with applications in motif scaffolding for protein design | Kieran Didi et.al. | 2312.09236v1 | null |
2023-12-14 | Reliability in Semantic Segmentation: Can We Use Synthetic Data? | Thibaut Loiseau et.al. | 2312.09231v1 | null |
2023-12-14 | Mosaic-SDF for 3D Generative Models | Lior Yariv et.al. | 2312.09222v1 | null |
2023-12-14 | Measurement in the Age of LLMs: An Application to Ideological Scaling | Sean O’Hagan et.al. | 2312.09203v1 | null |
2023-12-14 | Fast Sampling via De-randomization for Discrete Diffusion Models | Zixiang Chen et.al. | 2312.09193v1 | null |
2023-12-13 | PnPNet: Pull-and-Push Networks for Volumetric Segmentation with Boundary Confusion | Xin You et.al. | 2312.08323v1 | link |
2023-12-13 | Black-box Membership Inference Attacks against Fine-tuned Diffusion Models | Yan Pang et.al. | 2312.08207v1 | link |
2023-12-13 | SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space | Yunchen Li et.al. | 2312.08200v1 | link |
2023-12-13 | Concept-centric Personalization with Large-scale Diffusion Priors | Pu Cao et.al. | 2312.08195v1 | link |
2023-12-13 | $ρ$ -Diffusion: A diffusion-based density estimation framework for computational physics | Maxwell X. Cai et.al. | 2312.08153v1 | link |
2023-12-13 | Clockwork Diffusion: Efficient Generation With Model-Step Distillation | Amirhossein Habibian et.al. | 2312.08128v1 | link |
2023-12-12 | FreeInit: Bridging Initialization Gap in Video Diffusion Models | Tianxing Wu et.al. | 2312.07537v1 | link |
2023-12-12 | FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition | Sicheng Mo et.al. | 2312.07536v1 | null |
2023-12-12 | PEEKABOO: Interactive Video Generation via Masked-Diffusion | Yash Jain et.al. | 2312.07509v1 | null |
2023-12-12 | MinD-3D: Reconstruct High-quality 3D objects in Human Brain | Jianxiong Gao et.al. | 2312.07485v1 | null |
2023-12-12 | DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing | Kaiwen Zhang et.al. | 2312.07409v1 | null |
2023-12-12 | Boosting Latent Diffusion with Flow Matching | Johannes S. Fischer et.al. | 2312.07360v1 | link |
2023-12-12 | Momentum Particle Maximum Likelihood | Jen Ning Lim et.al. | 2312.07335v1 | null |
2023-12-11 | CAD: Photorealistic 3D Generation via Adversarial Distillation | Ziyu Wan et.al. | 2312.06663v1 | null |
2023-12-11 | Photorealistic Video Generation with Diffusion Models | Agrim Gupta et.al. | 2312.06662v1 | null |
2023-12-11 | UpFusion: Novel View Diffusion from Unposed Sparse View Observations | Bharath Raj Nagoor Kani et.al. | 2312.06661v1 | null |
2023-12-11 | Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior | Fangfu Liu et.al. | 2312.06655v1 | link |
2023-12-11 | Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution | Shangchen Zhou et.al. | 2312.06640v1 | null |
2023-12-11 | DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection | Haoyang He et.al. | 2312.06607v1 | link |
2023-12-11 | ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models | Denis Zavadski et.al. | 2312.06573v1 | link |
2023-12-11 | HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models | Xiaogang Peng et.al. | 2312.06553v1 | null |
2023-12-11 | In-situ Synchrotron X-Ray Photoelectron Spectroscopy Study of Medium-Temperature Baking of Niobium for SRF Application | Alena Prudnikava et.al. | 2312.06529v1 | null |
2023-12-08 | KBFormer: A Diffusion Model for Structured Entity Completion | Ouail Kitouni et.al. | 2312.05253v1 | null |
2023-12-08 | SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation | Thuan Hoang Nguyen et.al. | 2312.05239v1 | null |
2023-12-08 | Stoichiometry preservation and generalization of Bilger mixture fraction for non-premixed combustion with differential molecular diffusion | Haifeng Wang et.al. | 2312.05204v1 | null |
2023-12-08 | Membership Inference Attacks on Diffusion Models via Quantile Regression | Shuai Tang et.al. | 2312.05140v1 | null |
2023-12-08 | DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models | Mengyang Feng et.al. | 2312.05107v1 | null |
2023-12-08 | Application of deep learning to the estimation of normalization coefficients in diffusion-based covariance models | Folke K Skrunes et.al. | 2312.05068v1 | link |
2023-12-08 | SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control | Jaskirat Singh et.al. | 2312.05039v1 | null |
2023-12-08 | Numerical determination of iron dust laminar flame speeds with the counterflow twin-flame technique | C. E. A. G. van Gool et.al. | 2312.04994v1 | null |
2023-12-07 | Gen2Det: Generate to Detect | Saksham Suri et.al. | 2312.04566v1 | null |
2023-12-07 | NeRFiller: Completing Scenes via Generative 3D Inpainting | Ethan Weber et.al. | 2312.04560v1 | null |
2023-12-07 | PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation | Zhaoxi Chen et.al. | 2312.04559v1 | link |
2023-12-07 | GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | Shoufa Chen et.al. | 2312.04557v1 | null |
2023-12-07 | SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing | Tomoki Ichikawa et.al. | 2312.04553v1 | null |
2023-12-07 | Generating Illustrated Instructions | Sachit Menon et.al. | 2312.04552v1 | null |
2023-12-07 | PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play | Lili Chen et.al. | 2312.04549v1 | null |
2023-12-07 | HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image | Tong Wu et.al. | 2312.04543v1 | null |
2023-12-07 | Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance | Yuto Enyo et.al. | 2312.04529v1 | null |
2023-12-07 | RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models | Ozgur Kara et.al. | 2312.04524v1 | link |
2023-12-06 | Relightable Gaussian Codec Avatars | Shunsuke Saito et.al. | 2312.03704v1 | null |
2023-12-06 | Self-conditioned Image Generation via Generating Representations | Tianhong Li et.al. | 2312.03701v1 | link |
2023-12-06 | Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication | Ali Naseh et.al. | 2312.03692v1 | null |
2023-12-06 | WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on | Xujie Zhang et.al. | 2312.03667v1 | null |
2023-12-06 | TokenCompose: Grounding Diffusion with Token-level Supervision | Zirui Wang et.al. | 2312.03626v1 | link |
2023-12-06 | DreamComposer: Controllable 3D Object Generation via Multi-View Conditions | Yunhan Yang et.al. | 2312.03611v1 | null |
2023-12-06 | DiffusionSat: A Generative Foundation Model for Satellite Imagery | Samar Khanna et.al. | 2312.03606v1 | null |
2023-12-06 | MMM: Generative Masked Motion Model | Ekkasit Pinyoanuntapong et.al. | 2312.03596v1 | link |
2023-12-05 | ReconFusion: 3D Reconstruction with Diffusion Priors | Rundi Wu et.al. | 2312.02981v1 | null |
2023-12-05 | Alchemist: Parametric Control of Material Properties with Diffusion Models | Prafull Sharma et.al. | 2312.02970v1 | null |
2023-12-05 | AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model | Boheng Zhao et.al. | 2312.02967v1 | null |
2023-12-05 | Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Cheng-Ju Ho et.al. | 2312.02966v1 | link |
2023-12-05 | Drag-A-Video: Non-rigid Video Editing with Point-based Interaction | Yao Teng et.al. | 2312.02936v1 | null |
2023-12-05 | WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation | Jiachen Lu et.al. | 2312.02934v1 | link |
2023-12-05 | LivePhoto: Real Image Animation with Text-guided Motion Control | Xi Chen et.al. | 2312.02928v1 | null |
2023-12-05 | Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration | Yuang Ai et.al. | 2312.02918v1 | null |
2023-12-04 | Latent Feature-Guided Diffusion Models for Shadow Removal | Kangfu Mei et.al. | 2312.02156v1 | null |
2023-12-04 | Readout Guidance: Learning Control from Diffusion Features | Grace Luo et.al. | 2312.02150v1 | null |
2023-12-04 | Generative Powers of Ten | Xiaojuan Wang et.al. | 2312.02149v1 | null |
2023-12-04 | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Bingxin Ke et.al. | 2312.02145v1 | link |
2023-12-04 | DiffiT: Diffusion Vision Transformers for Image Generation | Ali Hatamizadeh et.al. | 2312.02139v1 | link |
2023-12-04 | Style Aligned Image Generation via Shared Attention | Amir Hertz et.al. | 2312.02133v1 | link |
2023-12-04 | VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence | Yuchao Gu et.al. | 2312.02087v1 | null |
2023-12-04 | Computational Investigation on Collective Dynamical Behaviors of Flickering Laminar Buoyant Diffusion Flames in Circular Arrays | Tao Yang et.al. | 2312.02018v1 | null |
2023-12-01 | MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video | Hengyi Wang et.al. | 2312.00778v1 | null |
2023-12-01 | VideoBooth: Diffusion-based Video Generation with Image Prompts | Yuming Jiang et.al. | 2312.00777v1 | null |
2023-12-01 | CompuCell3D Model of Cell Migration Reproduces Chemotaxis | Pedro C. Dal-Castel et.al. | 2312.00776v1 | link |
2023-12-01 | Effects of three-dimensional slit geometry on flashback of premixed hydrogen flames in perforated burners | Filippo Fruzza et.al. | 2312.00744v1 | null |
2023-12-01 | Resource-constrained knowledge diffusion processes inspired by human peer learning | Ehsan Beikihassan et.al. | 2312.00660v1 | null |
2023-12-01 | TrackDiffusion: Multi-object Tracking Data Generation via Diffusion Models | Pengxiang Li et.al. | 2312.00651v1 | null |
2023-12-01 | How the zebra got its stripes: Curvature-dependent diffusion orients Turing patterns on 3D surfaces | Michael F. Staddon et.al. | 2312.00637v1 | null |
2023-11-30 | VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Zhen Xing et.al. | 2311.18837v1 | null |
2023-11-30 | ART$\boldsymbol{\cdot}$V: Auto-Regressive Text-to-Video Generation with Diffusion Models | Wenming Weng et.al. | 2311.18834v1 | null |
2023-11-30 | Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction | Hsin-Ying Lee et.al. | 2311.18832v1 | link |
2023-11-30 | MotionEditor: Editing Video Motion via Content-Aware Diffusion | Shuyuan Tu et.al. | 2311.18830v1 | link |
2023-11-30 | MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Yanhui Wang et.al. | 2311.18829v1 | null |
2023-11-30 | One-step Diffusion with Distribution Matching Distillation | Tianwei Yin et.al. | 2311.18828v1 | null |
2023-11-30 | ElasticDiffusion: Training-free Arbitrary Size Image Generation | Moayed Haji-Ali et.al. | 2311.18822v1 | link |
2023-11-30 | Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters | James Seale Smith et.al. | 2311.18763v1 | null |
2023-11-29 | Do text-free diffusion models learn discriminative visual representations? | Soumik Mukhopadhyay et.al. | 2311.17921v1 | link |
2023-11-29 | Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models | Daniel Geng et.al. | 2311.17919v1 | null |
2023-11-29 | AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text | Jianfeng Zhang et.al. | 2311.17917v1 | null |
2023-11-29 | CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting | Alexander Vilesov et.al. | 2311.17907v1 | null |
2023-11-29 | SODA: Bottleneck Diffusion Models for Representation Learning | Drew A. Hudson et.al. | 2311.17901v1 | null |
2023-11-29 | Leveraging Graph Diffusion Models for Network Refinement Tasks | Puja Trivedi et.al. | 2311.17856v1 | null |
2023-11-29 | SPiC-E: Structural Priors in 3D Diffusion Models using Cross Entity Attention | Etai Sella et.al. | 2311.17834v1 | null |
2023-11-29 | Analyzing and Explaining Image Classifiers via Diffusion Guidance | Maximilian Augustin et.al. | 2311.17833v1 | null |
2023-11-28 | Material Palette: Extraction of Materials from a Single Image | Ivan Lopes et.al. | 2311.17060v1 | null |
2023-11-28 | ReMoS: Reactive 3D Motion Synthesis for Two-Person Interactions | Anindita Ghosh et.al. | 2311.17057v1 | null |
2023-11-28 | DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models | Tsun-Hsuan Wang et.al. | 2311.17053v1 | null |
2023-11-28 | Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models | Zhengming Yu et.al. | 2311.17050v1 | null |
2023-11-28 | Adversarial Diffusion Distillation | Axel Sauer et.al. | 2311.17042v1 | link |
2023-11-28 | Rumors with Changing Credibility | Charlotte Out et.al. | 2311.17040v1 | null |
2023-11-28 | Diffusion 3D Features (Diff3F): Decorating Untextured Shapes with Distilled Semantic Features | Niladri Shekhar Dutt et.al. | 2311.17024v1 | link |
2023-11-28 | Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer | Danah Yatim et.al. | 2311.17009v1 | null |
2023-11-28 | Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following | Yutong Feng et.al. | 2311.17002v1 | null |
2023-11-27 | Test-time Adaptation of Discriminative Models via Diffusion Generative Feedback | Mihir Prabhudesai et.al. | 2311.16102v1 | null |
2023-11-27 | CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | Christian Diller et.al. | 2311.16097v1 | null |
2023-11-27 | Street TryOn: Learning In-the-Wild Virtual Try-On from Unpaired Person Images | Aiyu Cui et.al. | 2311.16094v1 | null |
2023-11-27 | Self-correcting LLM-controlled Diffusion Models | Tsung-Han Wu et.al. | 2311.16090v1 | null |
2023-11-27 | DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization | Zhaoyang Xia et.al. | 2311.16060v1 | link |
2023-11-27 | Exploring Attribute Variations in Style-based GANs using Diffusion Models | Rishubh Parihar et.al. | 2311.16052v1 | null |
2023-11-27 | GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions | Jiemin Fang et.al. | 2311.16037v1 | null |
2023-11-27 | Closing the ODE-SDE gap in score-based diffusion models through the Fokker-Planck equation | Teo Deveney et.al. | 2311.15996v1 | null |
2023-11-27 | DiffAnt: Diffusion Models for Action Anticipation | Zeyun Zhong et.al. | 2311.15991v1 | null |
2023-11-24 | CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization | Ruoyu Zhao et.al. | 2311.14631v1 | null |
2023-11-24 | Received Signal and Channel Parameter Estimation in Molecular Communications | O. Tansel Baydas et.al. | 2311.14621v1 | null |
2023-11-24 | Animate124: Animating One Image to 4D Dynamic Scene | Yuyang Zhao et.al. | 2311.14603v1 | null |
2023-11-24 | On the thermodynamic invariance of fine-grain and coarse-grain fluid models | Thomas Dubos et.al. | 2311.14564v1 | null |
2023-11-24 | ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model | Eslam Mohamed Bakr et.al. | 2311.14542v1 | null |
2023-11-24 | GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting | Yiwen Chen et.al. | 2311.14521v1 | link |
2023-11-24 | MVControl: Adding Conditional Control to Multi-view Diffusion for Controllable Text-to-3D Generation | Zhiqi Li et.al. | 2311.14494v1 | link |
2023-11-22 | On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates | Stefano Bruno et.al. | 2311.13584v1 | null |
2023-11-22 | WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space | Katja Schwarz et.al. | 2311.13570v1 | null |
2023-11-22 | ADriver-I: A General World Model for Autonomous Driving | Fan Jia et.al. | 2311.13549v1 | null |
2023-11-22 | DiffusionMat: Alpha Matting as Sequential Refinement Learning | Yangyang Xu et.al. | 2311.13535v1 | null |
2023-11-22 | Guided Flows for Generative Modeling and Decision Making | Qinqing Zheng et.al. | 2311.13443v1 | null |
2023-11-22 | LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes | Jaeyoung Chung et.al. | 2311.13384v1 | null |
2023-11-21 | Bubble departure and sliding in high-pressure flow boiling of water | Artyom Kossolapov et.al. | 2311.12749v1 | null |
2023-11-21 | GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning | Jiaxi Lv et.al. | 2311.12631v1 | null |
2023-11-21 | HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis | Sang-Hoon Lee et.al. | 2311.12454v1 | link |
2023-11-21 | Stable Diffusion For Aerial Object Detection | Yanan Jian et.al. | 2311.12345v1 | null |
2023-11-21 | LoCo: Locally Constrained Training-Free Layout-to-Image Synthesis | Peiang Zhao et.al. | 2311.12342v1 | null |
2023-11-21 | Overcoming Pathology Image Data Deficiency: Generating Images from Pathological Transformation Process | Zeyu Liu et.al. | 2311.12316v1 | link |
2023-11-20 | Macroscopic description of a heavy particle immersed within a flow of light particles | Radek Erban et.al. | 2311.12021v1 | null |
2023-11-20 | An Image is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis | Aishwarya Agarwal et.al. | 2311.11919v1 | null |
2023-11-20 | Evolution of internal gravity waves in meso-scale eddies | Pablo Sebastia Saez et.al. | 2311.11916v1 | null |
2023-11-20 | Log-periodic oscillations as real-time signatures of hierarchical dynamics in proteins | Emanuel Dorbath et.al. | 2311.11839v1 | null |
2023-11-20 | Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning | Zixuan Xie et.al. | 2311.11825v1 | null |
2023-11-17 | Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning | Rohit Girdhar et.al. | 2311.10709v1 | null |
2023-11-17 | SelfEval: Leveraging the discriminative nature of generative models for evaluation | Sai Saketh Rambhatla et.al. | 2311.10708v1 | null |
2023-11-17 | Enhancing Object Coherence in Layout-to-Image Synthesis | Yibin Wang et.al. | 2311.10522v1 | link |
2023-11-16 | The Chosen One: Consistent Characters in Text-to-Image Diffusion Models | Omri Avrahami et.al. | 2311.10093v1 | null |
2023-11-16 | Spontaneous Opinion Swings in the Voter Model with Latency | Giovanni Palermo et.al. | 2311.10045v1 | null |
2023-11-16 | TransFusion – A Transparency-Based Diffusion Model for Anomaly Detection | Matic Fučka et.al. | 2311.09999v1 | null |
2023-11-16 | The divergence-free velocity formulation of the consistent Navier-Stokes Cahn-Hilliard model with non-matching densities, divergence-conforming discretization, and benchmarks | M. ten Eikelder et.al. | 2311.09966v1 | null |
2023-11-16 | DSR-Diff: Depth Map Super-Resolution with Diffusion Model | Yuan Shi et.al. | 2311.09919v1 | null |
2023-11-15 | Single-Image 3D Human Digitization with Shape-Guided Diffusion | Badour AlBahar et.al. | 2311.09221v1 | null |
2023-11-15 | DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model | Yinghao Xu et.al. | 2311.09217v1 | null |
2023-11-15 | Finding polarised communities and tracking information diffusion on Twitter: The Irish Abortion Referendum | Caroline Pena et.al. | 2311.09196v1 | null |
2023-11-15 | Fast Detection of Phase Transitions with Multi-Task Learning-by-Confusion | Julian Arnold et.al. | 2311.09128v1 | null |
2023-11-15 | Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search | Hefeng Wu et.al. | 2311.09084v1 | link |
2023-11-15 | A Spectral Diffusion Prior for Hyperspectral Image Super-Resolution | Jianjun Liu et.al. | 2311.08955v1 | null |
2023-11-13 | Fast and Space-Efficient Parallel Algorithms for Influence Maximization | Letong Wang et.al. | 2311.07554v1 | link |
2023-11-13 | Harnessing elastic instabilities for enhanced mixing and reaction kinetics in porous media | Christopher A. Browne et.al. | 2311.07431v1 | link |
2023-11-13 | Robust semi-supervised segmentation with timestep ensembling diffusion models | Margherita Rosnati et.al. | 2311.07421v1 | null |
2023-11-10 | Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization | Weiyang Liu et.al. | 2311.06243v1 | null |
2023-11-10 | Diffusion Models for Earth Observation Use-cases: from cloud removal to urban change detection | Fulvio Sanguigni et.al. | 2311.06222v1 | null |
2023-11-10 | Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model | Jiahao Li et.al. | 2311.06214v1 | null |
2023-11-10 | Turbulence Scaling from Deep Learning Diffusion Generative Models | Tim Whittaker et.al. | 2311.06112v1 | null |
2023-11-09 | Diffusion-Generative Multi-Fidelity Learning for Physical Simulation | Zheng Wang et.al. | 2311.05606v1 | null |
2023-11-09 | Bayesian Methods for Media Mix Modelling with shape and funnel effects | Javier Marin et.al. | 2311.05587v1 | null |
2023-11-09 | LCM-LoRA: A Universal Stable-Diffusion Acceleration Module | Simian Luo et.al. | 2311.05556v1 | link |
2023-11-09 | From Stability to Change: The Potential Application of Bifurcation Theory to Opinion Dynamics Considerations | Yasuko Kawahata et.al. | 2311.05488v1 | null |
2023-11-09 | Lithium-ion battery performance model including solvent segregation effects | Ruihe Li et.al. | 2311.05467v1 | null |
2023-11-09 | 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models | Haibo Yang et.al. | 2311.05464v1 | link |
2023-11-09 | ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors | Jingwen Chen et.al. | 2311.05463v1 | null |
2023-11-08 | Transferability of atomic energies from alchemical decomposition | Michael J. Sahre et.al. | 2311.04784v1 | link |
2023-11-08 | Weakly-supervised deepfake localization in diffusion-generated images | Dragos Tantaru et.al. | 2311.04584v1 | link |
2023-11-07 | I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models | Shiwei Zhang et.al. | 2311.04145v1 | link |
2023-11-07 | Simple Bundles of Complex Networks | Alexandre Benatti et.al. | 2311.04133v1 | null |
2023-11-07 | Generative Structural Design Integrating BIM and Diffusion Model | Zhili He et.al. | 2311.04052v1 | link |
2023-11-07 | A Method to Improve the Performance of Reinforcement Learning Based on the Y Operator for a Class of Stochastic Differential Equation-Based Child-Mother Systems | Cheng Yin et.al. | 2311.04014v1 | null |
2023-11-06 | TS-Diffusion: Generating Highly Complex Time Series with Diffusion Models | Yangming Li et.al. | 2311.03303v1 | null |
2023-11-06 | LDM3D-VR: Latent Diffusion Model for 3D VR | Gabriela Ben Melech Stan et.al. | 2311.03226v1 | null |
2023-11-06 | Persistent homology for high-dimensional data based on spectral methods | Sebastian Damrich et.al. | 2311.03087v1 | link |
2023-11-06 | AnyText: Multilingual Visual Text Generation And Editing | Yuxiang Tuo et.al. | 2311.03054v1 | link |
2023-11-03 | Latent Diffusion Model for Conditional Reservoir Facies Generation | Daesoo Lee et.al. | 2311.01968v1 | null |
2023-11-03 | DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder | Tao Liu et.al. | 2311.01811v1 | null |
2023-11-03 | On the Generalization Properties of Diffusion Models | Puheng Li et.al. | 2311.01797v1 | link |
2023-11-03 | PDF: Point Diffusion Implicit Function for Large-scale Scene Neural Representation | Yuhan Ding et.al. | 2311.01773v1 | null |
2023-11-03 | CDGraph: Dual Conditional Social Graph Synthesizing via Diffusion Model | Jui-Yi Tsai et.al. | 2311.01729v1 | null |
2023-11-02 | Time Series Anomaly Detection using Diffusion-based Models | Ioana Pintilie et.al. | 2311.01452v1 | link |
2023-11-02 | Constrained-Context Conditional Diffusion Models for Imitation Learning | Vaibhav Saxena et.al. | 2311.01419v1 | null |
2023-11-02 | The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing | Shen Nie et.al. | 2311.01410v1 | null |
2023-11-02 | Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors | Gabriele M. Caddeo et.al. | 2311.01380v1 | link |
2023-11-02 | DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning | Wenxuan Bao et.al. | 2311.01295v1 | link |
2023-11-02 | Unraveling Diffusion in Fusion Plasma: A Case Study of In Situ Processing and Particle Sorting | Junmin Gu et.al. | 2311.01288v1 | null |
2023-11-01 | De-Diffusion Makes Text a Strong Cross-Modal Interface | Chen Wei et.al. | 2311.00618v1 | null |
2023-11-01 | Controllable Music Production with Diffusion Models and Guidance Gradients | Mark Levy et.al. | 2311.00613v1 | null |
2023-11-01 | Intriguing Properties of Data Attribution on Diffusion Models | Xiaosen Zheng et.al. | 2311.00500v1 | link |
2023-11-01 | Diffusion models for probabilistic programming | Simon Dirmeier et.al. | 2311.00474v1 | link |
2023-11-01 | Dual Conditioned Diffusion Models for Out-Of-Distribution Detection: Application to Fetal Ultrasound Videos | Divyanshu Mishra et.al. | 2311.00469v1 | null |
2023-10-31 | SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction | Xinyuan Chen et.al. | 2310.20700v1 | null |
2023-10-31 | Diffusion Reconstruction of Ultrasound Images with Informative Uncertainty | Yuxin Zhang et.al. | 2310.20618v1 | null |
2023-10-29 | JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation | Yao Yao et.al. | 2310.19180v1 | null |
2023-10-29 | Learning to Follow Object-Centric Image Editing Instructions Faithfully | Tuhin Chakrabarty et.al. | 2310.19145v1 | link |
2023-10-29 | Backward and Forward Inference in Interacting Independent-Cascade Processes: A Scalable and Convergent Message-Passing Approach | Nouman Khan et.al. | 2310.19138v1 | null |
2023-10-29 | Bespoke Solvers for Generative Flow Models | Neta Shaul et.al. | 2310.19075v1 | null |
2023-10-29 | Controllable Group Choreography using Contrastive Diffusion | Nhat Le et.al. | 2310.18986v1 | null |
2023-10-29 | Adversarial Examples Are Not Real Features | Ang Li et.al. | 2310.18936v1 | link |
2023-10-27 | Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models | Pushkal Katara et.al. | 2310.18308v1 | null |
2023-10-27 | Unsteady evolution of slip and drag in surfactant-contaminated superhydrophobic channels | Samuel D. Tomlinson et.al. | 2310.18184v1 | null |
2023-10-27 | Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN | Neeraj Kumar et.al. | 2310.18169v1 | null |
2023-10-27 | Lost in Translation – Multilingual Misinformation and its Evolution | Dorian Quelle et.al. | 2310.18089v1 | null |
2023-10-27 | ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image | Kyle Sargent et.al. | 2310.17994v1 | null |
2023-10-26 | 6-DoF Stability Field via Diffusion Models | Takuma Yoneda et.al. | 2310.17649v1 | null |
2023-10-26 | Generative Fractional Diffusion Models | Gabriel Nobis et.al. | 2310.17638v1 | null |
2023-10-26 | Orbital-optimized Density Functional Calculations of Molecular Rydberg Excited States with Real Space Grid Representation and Self-Interaction Correction | Alec E. Sigurðarson et.al. | 2310.17605v1 | null |
2023-10-26 | Noise-Free Score Distillation | Oren Katzir et.al. | 2310.17590v1 | null |
2023-10-27 | Global Structure-Aware Diffusion Process for Low-Light Image Enhancement | Jinhui Hou et.al. | 2310.17577v2 | link |
2023-10-26 | DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation | Yongxin Zhu et.al. | 2310.17570v1 | null |
2023-10-26 | SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching | Xinghui Li et.al. | 2310.17569v1 | null |
2023-10-27 | The Expressive Power of Low-Rank Adaptation | Yuchen Zeng et.al. | 2310.17513v2 | link |
2023-10-25 | PERF: Panoramic Neural Radiance Field from a Single Panorama | Guangcong Wang et.al. | 2310.16831v1 | link |
2023-10-25 | CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images | Aaron Gokaslan et.al. | 2310.16825v1 | link |
2023-10-26 | DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior | Jingxiang Sun et.al. | 2310.16818v2 | link |
2023-10-25 | Optical Kinetic Theory of Nonlinear Multi-mode Photonic Networks | Arkady Kurnosov et.al. | 2310.16784v1 | null |
2023-10-25 | Kiki or Bouba? Sound Symbolism in Vision-and-Language Models | Morris Alper et.al. | 2310.16781v1 | null |
2023-10-25 | Multi-scale Diffusion Denoised Smoothing | Jongheon Jeong et.al. | 2310.16779v1 | link |
2023-10-25 | Discrete variance decay analysis of spurious mixing | Tridib Banerjee et.al. | 2310.16768v1 | null |
2023-10-25 | Scalar mass conservation in turbulent mixture fraction based combustion models through consistent local flow parameters | Marco Davidovic et.al. | 2310.16743v1 | null |
2023-10-24 | From Posterior Sampling to Meaningful Diversity in Image Restoration | Noa Cohen et.al. | 2310.16047v1 | null |
2023-10-24 | CVPR 2023 Text Guided Video Editing Competition | Jay Zhangjie Wu et.al. | 2310.16003v1 | link |
2023-10-24 | Classical wave-particle localization in disordered landscapes | Abel J. Abraham et.al. | 2310.16000v1 | null |
2023-10-25 | Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles | Xing Shen et.al. | 2310.15952v2 | null |
2023-10-24 | Language-driven Scene Synthesis using Multi-conditional Diffusion Model | An Vuong et.al. | 2310.15948v1 | link |
2023-10-23 | FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling | Haonan Qiu et.al. | 2310.15169v1 | link |
2023-10-23 | Matryoshka Diffusion Models | Jiatao Gu et.al. | 2310.15111v1 | null |
2023-10-23 | Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model | Ruoxi Shi et.al. | 2310.15110v1 | link |
2023-10-24 | Wonder3D: Single Image to 3D using Cross-Domain Diffusion | Xiaoxiao Long et.al. | 2310.15008v2 | null |
2023-10-23 | Orientation-Aware Leg Movement Learning for Action-Driven Human Motion Prediction | Chunzhi Gu et.al. | 2310.14907v1 | null |
2023-10-20 | Achieving Single-Electron Sensitivity at Enhanced Speed in Fully-Depleted CCDs with Double-Gate MOSFETs | Miguel Sofo-Haro et.al. | 2310.13644v1 | null |
2023-10-20 | ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection | Zhongzhan Huang et.al. | 2310.13545v1 | link |
2023-10-20 | A Critical Insight into Pretransitional Behavior and Dielectric Tunability of Relaxor Ceramics | Sylwester J. Rzoska et.al. | 2310.13326v1 | null |
2023-10-19 | Variational Inference for SDEs Driven by Fractional Noise | Rembert Daems et.al. | 2310.12975v1 | null |
2023-10-19 | A Markovian dynamics for *C. elegans* behavior across scales | Antonio C. Costa et.al. | 2310.12883v1 | link |
2023-10-19 | EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model | Zheyuan Zhang et.al. | 2310.12868v1 | null |
2023-10-19 | An effective theory of collective deep learning | Lluís Arola-Fernández et.al. | 2310.12802v1 | link |
2023-10-19 | Energy-Based Models For Speech Synthesis | Wanli Sun et.al. | 2310.12765v1 | null |
2023-10-18 | Object-aware Inversion and Reassembly for Image Editing | Zhen Yang et.al. | 2310.12149v1 | null |
2023-10-18 | Quality Diversity through Human Feedback | Li Ding et.al. | 2310.12103v1 | link |
2023-10-18 | Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Feng Luo et.al. | 2310.12004v1 | link |
2023-10-18 | Bayesian Flow Networks in Continual Learning | Mateusz Pyla et.al. | 2310.12001v1 | null |
2023-10-18 | InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation | Renzhi Wang et.al. | 2310.11976v1 | link |
2023-10-17 | Elucidating The Design Space of Classifier-Guided Diffusion Generation | Jiajun Ma et.al. | 2310.11311v1 | link |
2023-10-17 | Favorable and unfavorable many-body interactions for near-field radiative heat transfer in nanoparticle networks | Minggang Luo et.al. | 2310.11273v1 | null |
2023-10-17 | A diffusive wetting model for water entry/exit based on the weakly-compressible SPH method | Shuoguo Zhang et.al. | 2310.11179v1 | null |
2023-10-17 | Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion | Xueyao Zhang et.al. | 2310.11160v1 | link |
2023-10-17 | BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference | Siqi Kou et.al. | 2310.11142v1 | link |
2023-10-17 | 3D Structure-guided Network for Tooth Alignment in 2D Photograph | Yulong Dou et.al. | 2310.11106v1 | link |
2023-10-16 | A Survey on Video Diffusion Models | Zhen Xing et.al. | 2310.10647v1 | link |
2023-10-16 | TOSS: High-quality Text-guided Novel View Synthesis from a Single Image | Yukai Shi et.al. | 2310.10644v1 | null |
2023-10-16 | LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts | Hanan Gani et.al. | 2310.10640v1 | link |
2023-10-16 | Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models | Kevin Black et.al. | 2310.10639v1 | link |
2023-10-16 | DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing | Jia-Wei Liu et.al. | 2310.10624v1 | null |
2023-10-16 | ViPE: Visualise Pretty-much Everything | Hassan Shahmohammadi et.al. | 2310.10543v1 | link |
2023-10-13 | Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy | Anton Baryshnikov et.al. | 2310.09247v1 | link |
2023-10-13 | Unseen Image Synthesis with Diffusion Models | Ye Zhu et.al. | 2310.09213v1 | null |
2023-10-13 | The effect of solar wind on the charged particles’ diffusion coefficients | J. F. Wang et.al. | 2310.09211v1 | null |
2023-10-12 | OmniControl: Control Any Joint at Any Time for Human Motion Generation | Yiming Xie et.al. | 2310.08580v1 | link |
2023-10-12 | HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion | Xian Liu et.al. | 2310.08579v1 | null |
2023-10-12 | NetDiffusion: Network Data Augmentation Through Protocol-Constrained Traffic Generation | Xi Jiang et.al. | 2310.08543v1 | null |
2023-10-12 | GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors | Taoran Yi et.al. | 2310.08529v1 | link |
2023-10-12 | MotionDirector: Motion Customization of Text-to-Video Diffusion Models | Rui Zhao et.al. | 2310.08465v1 | link |
2023-10-12 | Debias the Training of Diffusion Models | Hu Yu et.al. | 2310.08442v1 | null |
2023-10-12 | Neural Diffusion Models | Grigory Bartosh et.al. | 2310.08337v1 | null |
2023-10-11 | ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models | Yingqing He et.al. | 2310.07702v1 | link |
2023-10-11 | ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation | Bo Peng et.al. | 2310.07697v1 | link |
2023-10-11 | Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models | Lai Zeqiang et.al. | 2310.07653v1 | link |
2023-10-11 | Flux gradient relations and their dependence on turbulence anisotropy | Samuele Mosso et.al. | 2310.07503v1 | null |
2023-10-11 | Boosting Black-box Attack to Deep Neural Networks with Conditional Diffusion Models | Renyang Liu et.al. | 2310.07492v1 | null |
2023-10-11 | Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else | Hazarapet Tunanyan et.al. | 2310.07419v1 | null |
2023-10-10 | What Does Stable Diffusion Know about the 3D Scene? | Guanqi Zhan et.al. | 2310.06836v1 | link |
2023-10-10 | Impact of grain boundary and surface diffusion on predicted fission gas bubble behavior and release in UO$_2$ fuel | Md Ali Muntaha et.al. | 2310.06795v1 | null |
2023-10-10 | HiFi-123: Towards High-fidelity One Image to 3D Content Generation | Wangbo Yu et.al. | 2310.06744v1 | null |
2023-10-10 | Latent Diffusion Counterfactual Explanations | Karim Farid et.al. | 2310.06668v1 | null |
2023-10-10 | Tertiary Lymphoid Structures Generation through Graph-based Diffusion | Manuel Madeira et.al. | 2310.06661v1 | null |
2023-10-09 | FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing | Yuren Cong et.al. | 2310.05922v1 | null |
2023-10-10 | Geom-Erasing: Geometry-Driven Removal of Implicit Concept in Diffusion Models | Zhili Liu et.al. | 2310.05873v2 | null |
2023-10-09 | A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models | Sebastian G. Gruber et.al. | 2310.05833v1 | null |
2023-10-09 | DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models | Shansan Gong et.al. | 2310.05793v1 | link |
2023-10-09 | Language Model Beats Diffusion – Tokenizer is Key to Visual Generation | Lijun Yu et.al. | 2310.05737v1 | link |
2023-10-09 | CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis | Xiaoxiao Sun et.al. | 2310.04414v2 | null |
2023-10-06 | Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference | Simian Luo et.al. | 2310.04378v1 | link |
2023-10-05 | Aligning Text-to-Image Diffusion Models with Reward Backpropagation | Mihir Prabhudesai et.al. | 2310.03739v1 | link |
2023-10-05 | Stochastic interpolants with data-dependent couplings | Michael S. Albergo et.al. | 2310.03725v1 | null |
2023-10-05 | Ctrl-Room: Controllable Text-to-3D Room Meshes Generation with Layout Constraints | Chuan Fang et.al. | 2310.03602v1 | null |
2023-10-05 | Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion | Anton Razzhigaev et.al. | 2310.03502v1 | link |
2023-10-05 | Deep Generative Models of Music Expectation | Ninon Lizé Masclef et.al. | 2310.03500v1 | null |
2023-10-05 | An Extended Phase Graph-based framework for DANTE-SPACE simulations including physiological, temporal, and spatial variations | Matthijs H. S. de Buck et.al. | 2310.03429v1 | null |
2023-10-04 | Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models | Jianglong Ye et.al. | 2310.03020v1 | null |
2023-10-04 | Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day | Yifan Jiang et.al. | 2310.03015v1 | null |
2023-10-04 | Probing Intersectional Biases in Vision-Language Models with Counterfactual Examples | Phillip Howard et.al. | 2310.02988v1 | null |
2023-10-04 | T$^3$Bench: Benchmarking Current Progress in Text-to-3D Generation | Yuze He et.al. | 2310.02977v1 | link |
2023-10-04 | Fast, Expressive SE$(n)$ Equivariant Networks through Weight-Sharing in Position-Orientation Space | Erik J Bekkers et.al. | 2310.02970v1 | link |
2023-10-04 | Optimal Transport with Adaptive Regularisation | Hugues Van Assel et.al. | 2310.02925v1 | null |
2023-10-04 | Boosting Dermatoscopic Lesion Segmentation via Diffusion Models with Visual and Textual Prompts | Shiyi Du et.al. | 2310.02906v1 | null |
2023-10-03 | Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models | Huaijin Pi et.al. | 2310.02242v1 | null |
2023-10-03 | Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks | Luca Scimeca et.al. | 2310.02230v1 | null |
2023-10-03 | Global Attractor for a Reaction-Diffusion Model Arising in Biological Dynamic in 3D Soil Structure | Mohamed Elghandouri et.al. | 2310.02060v1 | null |
2023-10-03 | AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model | Zibin Dong et.al. | 2310.02054v1 | null |
2023-10-03 | Spectral operator learning for parametric PDEs without data reliance | Junho Choi et.al. | 2310.02013v1 | null |
2023-10-03 | Optimizing microlens arrays for incoherent HiLo microscopy | Ziao Jiao et.al. | 2310.01939v1 | null |
2023-10-02 | LLM-grounded Video Diffusion Models | Long Lian et.al. | 2309.17444v2 | null |
2023-09-29 | Directly Fine-Tuning Diffusion Models on Differentiable Rewards | Kevin Clark et.al. | 2309.17400v1 | null |
2023-09-29 | Physics-Informed Neural Network for the Transient Diffusivity Equation in Reservoir Engineering | Daniel Badawi et.al. | 2309.17345v1 | null |
2023-09-28 | KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing | Jiancheng Huang et.al. | 2309.16608v1 | null |
2023-09-28 | CCEdit: Creative and Controllable Video Editing via Diffusion Models | Ruoyu Feng et.al. | 2309.16496v1 | null |
2023-09-28 | Distilling ODE Solvers of Diffusion Models into Smaller Steps | Sanghwan Kim et.al. | 2309.16421v1 | null |
2023-09-27 | Exploiting the Signal-Leak Bias in Diffusion Models | Martin Nicolas Everaert et.al. | 2309.15842v1 | null |
2023-09-27 | Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | David Junhao Zhang et.al. | 2309.15818v1 | link |
2023-09-27 | Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack | Xiaoliang Dai et.al. | 2309.15807v1 | null |
2023-09-27 | Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation | Xin Yuan et.al. | 2309.15726v1 | null |
2023-09-27 | Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing | Kai Wang et.al. | 2309.15664v1 | link |
2023-09-27 | Direct Sensing of Remote Nuclei: Expanding the Reach of Cross-Effect Dynamic Nuclear Polarization | Amaria Javed et.al. | 2309.15653v1 | null |
2023-09-26 | Generating Visual Scenes from Touch | Fengyu Yang et.al. | 2309.15117v1 | null |
2023-09-27 | LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models | Yaohui Wang et.al. | 2309.15103v2 | link |
2023-09-26 | FEC: Three Finetuning-free Methods to Enhance Consistency for Real Image Editing | Songyan Chen et.al. | 2309.14934v1 | null |
2023-09-27 | ITEM3D: Illumination-Aware Directional Texture Editing for 3D Models | Shengqi Liu et.al. | 2309.14872v2 | null |
2023-09-26 | Navigating Text-To-Image Customization: From LyCORIS Fine-Tuning to Model Evaluation | Shin-Ying Yeh et.al. | 2309.14859v1 | link |
2023-09-25 | Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation | Quang Nguyen et.al. | 2309.14303v1 | link |
2023-09-25 | Soft Mixture Denoising: Beyond the Expressive Bottleneck of Diffusion Models | Yangming Li et.al. | 2309.14068v1 | null |
2023-09-25 | Mixing as a correlated aggregation process | Joris Heyman et.al. | 2309.14040v1 | link |
2023-09-22 | MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation | Jiahao Xie et.al. | 2309.13042v1 | link |
2023-09-22 | Diffusion Augmentation for Sequential Recommendation | Qidong Liu et.al. | 2309.12858v1 | link |
2023-09-22 | Accuracy and stability analysis of horizontal discretizations used in unstructured grid ocean models | Fabricio Rodrigues Lapolli et.al. | 2309.12832v1 | null |
2023-09-22 | Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography | Rabin Adhikari et.al. | 2309.12829v1 | link |
2023-09-22 | Semantic Change Driven Generative Semantic Communication Framework | Wanting Yang et.al. | 2309.12775v1 | link |
2023-09-21 | A Diffusion-Model of Joint Interactive Navigation | Matthew Niedoba et.al. | 2309.12508v1 | null |
2023-09-21 | License Plate Super-Resolution Using Diffusion Models | Sawsan AlHalawani et.al. | 2309.12506v1 | null |
2023-09-21 | Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis | Ben Maman et.al. | 2309.12283v1 | null |
2023-09-20 | FreeU: Free Lunch in Diffusion U-Net | Chenyang Si et.al. | 2309.11497v1 | link |
2023-09-20 | Generative Agent-Based Modeling: Unveiling Social System Dynamics through Coupling Mechanistic Models with Generative Artificial Intelligence | Navid Ghaffarzadegan et.al. | 2309.11456v1 | null |
2023-09-20 | Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models | Song Mei et.al. | 2309.11420v1 | null |
2023-09-20 | EDMP: Ensemble-of-costs-guided Diffusion for Motion Planning | Kallol Saha et.al. | 2309.11414v1 | link |
2023-09-20 | Face Aging via Diffusion-based Editing | Xiangyi Chen et.al. | 2309.11321v1 | link |
2023-09-20 | FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion | Stefan Stan et.al. | 2309.11306v1 | link |
2023-09-19 | PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance | Peiqing Yang et.al. | 2309.10810v1 | link |
2023-09-19 | Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation | Yatong Bai et.al. | 2309.10740v1 | link |
2023-09-19 | Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising | Yujin Wang et.al. | 2309.10714v1 | null |
2023-09-18 | Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees | Alexia Jolicoeur-Martineau et.al. | 2309.09968v1 | link |
2023-09-18 | What is a Fair Diffusion Model? Designing Generative Text-To-Image Models to Incorporate Various Worldviews | Zoe De Simone et.al. | 2309.09944v1 | link |
2023-09-18 | DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving | Xiaofeng Wang et.al. | 2309.09777v1 | null |
2023-09-18 | Application-driven Validation of Posteriors in Inverse Problems | Tim J. Adler et.al. | 2309.09764v1 | null |
2023-09-19 | Non-Hermitian physics and topological phenomena in convective thermal metamaterials | Zhoufei Liu et.al. | 2309.09681v2 | null |
2023-09-18 | Anomalous Diffusion of Lithium-Anion Clusters in Ionic Liquids | YeongKyu Lee et.al. | 2309.09674v1 | null |
2023-09-15 | Compositional Foundation Models for Hierarchical Planning | Anurag Ajay et.al. | 2309.08587v1 | null |
2023-09-15 | Denoising Diffusion Probabilistic Models for Hardware-Impaired Communications | Mehdi Letafati et.al. | 2309.08568v1 | null |
2023-09-15 | Breathing New Life into 3D Assets with Generative Repainting | Tianfu Wang et.al. | 2309.08523v1 | link |
2023-09-15 | Diffuse-illumination holographic optical coherence tomography | Léo Puyo et.al. | 2309.08486v1 | null |
2023-09-15 | Large-Vocabulary 3D Diffusion Model with Transformer | Ziang Cao et.al. | 2309.07920v2 | null |
2023-09-14 | Generative Image Dynamics | Zhengqi Li et.al. | 2309.07906v1 | null |
2023-09-14 | Beta Diffusion | Mingyuan Zhou et.al. | 2309.07867v1 | link |
2023-09-14 | Study and evaluation of the Ronen Method accuracy at material interfaces | Johan Cufe et.al. | 2309.07756v1 | null |
2023-09-14 | Dual-angle interferometric scattering microscopy for optical multiparametric particle characterization | Erik Olsén et.al. | 2309.07572v1 | null |
2023-09-13 | UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons | Sicheng Yang et.al. | 2309.07051v1 | link |
2023-09-13 | Experimental Study on the Detection of Frozen Diffused Ammonia Blockage in the Inactive Section of a Variable Conductance Heat Pipe | F. K. Miranda et.al. | 2309.06936v1 | null |
2023-09-13 | DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models | Namhyuk Ahn et.al. | 2309.06933v1 | null |
2023-09-13 | MagiCapture: High-Resolution Multi-Concept Portrait Customization | Junha Hyung et.al. | 2309.06895v1 | null |
2023-09-13 | DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-speech Generation | Zhichao Wu et.al. | 2309.06787v1 | null |
2023-09-13 | High throughput sampling of phase space with deep learning potentials: $δ$-AlOOH at geophysical conditions | Chenxing Luo et.al. | 2309.06712v1 | null |
2023-09-13 | Generalizable improvement of the Spalart-Allmaras model through assimilation of experimental data | Deepinder Jot Singh Aulakh et.al. | 2309.06679v1 | null |
2023-09-12 | InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Xingchao Liu et.al. | 2309.06380v1 | link |
2023-09-12 | Dispersion versus diffusion in mixing fronts | Gauthier Rousseau et.al. | 2309.06347v1 | null |
2023-09-12 | Unraveling biochemical spatial patterns: machine learning approaches to the inverse problem of Turing patterns | Antonio Matas-Gil et.al. | 2309.06339v1 | link |
2023-09-12 | Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model | Yin Wang et.al. | 2309.06284v1 | null |
2023-09-11 | Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips | Yufei Ye et.al. | 2309.05663v1 | null |
2023-09-11 | PAI-Diffusion: Constructing and Serving a Family of Open Chinese Diffusion Models for Text-to-image Synthesis on the Cloud | Chengyu Wang et.al. | 2309.05534v1 | null |
2023-09-11 | NExT-GPT: Any-to-Any Multimodal LLM | Shengqiong Wu et.al. | 2309.05519v1 | link |
2023-09-08 | Variations and Relaxations of Normalizing Flows | Keegan Kelly et.al. | 2309.04433v1 | null |
2023-09-08 | Create Your World: Lifelong Text-to-Image Diffusion | Gan Sun et.al. | 2309.04430v1 | null |
2023-09-08 | MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask | Yupeng Zhou et.al. | 2309.04399v1 | null |
2023-09-08 | MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers | Sijia Li et.al. | 2309.04372v1 | null |
2023-09-08 | The role of tumbling in bacterial scattering at convex obstacles | Theresa Jakuszeit et.al. | 2309.04326v1 | null |
2023-09-07 | InstructDiffusion: A Generalist Modeling Interface for Vision Tasks | Zigang Geng et.al. | 2309.03895v1 | null |
2023-09-07 | DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection | Manlin Zhang et.al. | 2309.03893v1 | null |
2023-09-07 | Text-to-feature diffusion for audio-visual few-shot learning | Otniel-Bogdan Mercea et.al. | 2309.03869v1 | link |
2023-09-07 | Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption | Teng Hu et.al. | 2309.03729v1 | link |
2023-09-07 | DiffDefense: Defending against Adversarial Attacks via Diffusion Models | Hondamunige Prasanna Silva et.al. | 2309.03702v1 | link |
2023-09-07 | Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model | Sungwon Hwang et.al. | 2309.03550v1 | null |
2023-09-06 | My Art My Choice: Adversarial Protection Against Unruly AI | Anthony Rhodes et.al. | 2309.03198v1 | null |
2023-09-06 | SLiMe: Segment Like Me | Aliasghar Khani et.al. | 2309.03179v1 | link |
2023-09-06 | MCM: Multi-condition Motion Synthesis Framework for Multi-scenario | Zeyu Ling et.al. | 2309.03031v1 | null |
2023-09-05 | Generating Realistic Images from In-the-wild Sounds | Taegyeong Lee et.al. | 2309.02405v1 | null |
2023-09-05 | A Diffusion Quantum Monte Carlo Approach to the Polaritonic Ground State | Braden M. Weight et.al. | 2309.02349v1 | link |
2023-09-05 | Robust frequency-dependent diffusion kurtosis computation using an efficient direction scheme, axisymmetric modelling, and spatial regularization | J. Hamilton et.al. | 2309.02319v1 | null |
2023-09-05 | Neuromorphic nanocluster networks: Critical role of the substrate in nano-link formation | Wenkai Wu et.al. | 2309.02299v1 | null |
2023-09-05 | Robustness and Generalizability of Deepfake Detection: A Study with Diffusion Models | Haixu Song et.al. | 2309.02218v1 | link |
2023-09-01 | Iterative Multi-granular Image Editing using Diffusion Models | K J Joseph et.al. | 2309.00613v1 | null |
2023-09-01 | VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation | Xin Li et.al. | 2309.00398v1 | null |
2023-09-01 | Fast Diffusion EM: a diffusion model for blind inverse problems with application to deconvolution | Charles Laroche et.al. | 2309.00287v1 | link |
2023-09-01 | Data-driven Topology Optimization of Channel Flow Problems | Ce Guan et.al. | 2309.00278v1 | null |
2023-08-31 | InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion | Sirui Xu et.al. | 2308.16905v1 | link |
2023-09-01 | GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields | Yanjie Ze et.al. | 2308.16891v2 | link |
2023-08-31 | Prediction of Diblock Copolymer Morphology via Machine Learning | Hyun Park et.al. | 2308.16886v1 | null |
2023-08-31 | Diffusion Models for Interferometric Satellite Aperture Radar | Alexandre Tuel et.al. | 2308.16847v1 | link |
2023-09-01 | Irregular Traffic Time Series Forecasting Based on Asynchronous Spatio-Temporal Graph Convolutional Network | Weijia Zhang et.al. | 2308.16818v2 | null |
2023-09-01 | Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models | Minheng Ni et.al. | 2308.16777v2 | null |
2023-08-31 | Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance | Zexin Hu et.al. | 2308.16725v1 | null |
2023-08-30 | SignDiff: Learning Diffusion Models for American Sign Language Production | Sen Fang et.al. | 2308.16082v1 | null |
2023-08-30 | Click Metamaterials: Fast Acquisition of Thermal Conductivity and Functionality Diversities | Chengmeng Wang et.al. | 2308.16057v1 | null |
2023-08-30 | DiffuVolume: Diffusion Model for Volume based Stereo Matching | Dian Zheng et.al. | 2308.15989v1 | null |
2023-08-30 | Physics-Informed DeepMRI: Bridging the Gap from Heat Diffusion to k-Space Interpolation | Zhuo-Xu Cui et.al. | 2308.15918v1 | null |
2023-08-29 | ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer | Zachary Horvitz et.al. | 2308.15459v1 | link |
2023-08-29 | Vortex core radius in baroclinic turbulence: Implications for scaling predictions | Gabriel Hadjerci et.al. | 2308.15398v1 | null |
2023-08-29 | Rayleigh-Bénard instability in a horizontal porous layer with anomalous diffusion | Antonio Barletta et.al. | 2308.15359v1 | null |
2023-08-30 | Elucidating the Exposure Bias in Diffusion Models | Mang Ning et.al. | 2308.15321v2 | link |
2023-08-28 | Total Selfie: Generating Full-Body Selfies | Bowei Chen et.al. | 2308.14740v1 | null |
2023-08-28 | Oscillating reaction in porous media under saddle flow | Satoshi Izumoto et.al. | 2308.14723v1 | null |
2023-08-28 | 360-Degree Panorama Generation from Few Unregistered NFoV Images | Jionghao Wang et.al. | 2308.14686v1 | link |
2023-08-28 | Effect of gas diffusion layer fiber shape on cathode two-phase dynamics in proton exchange membrane fuel cells | Danan Yang et.al. | 2308.14539v1 | null |
2023-08-28 | Priority-Centric Human Motion Generation in Discrete Latent Space | Hanyang Kong et.al. | 2308.14480v1 | null |
2023-08-25 | Distribution-Aligned Diffusion for Human Mesh Recovery | Lin Geng Foo et.al. | 2308.13369v1 | null |
2023-08-25 | Age of Information Diffusion on Social Networks: Optimizing Multi-Stage Seeding Strategies | Songhua Li et.al. | 2308.13303v1 | null |
2023-08-25 | EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior | Minda Zhao et.al. | 2308.13223v1 | link |
2023-08-25 | Diff-Retinex: Rethinking Low-light Image Enhancement with A Generative Diffusion Model | Xunpeng Yi et.al. | 2308.13164v1 | null |
2023-08-25 | A Survey of Diffusion Based Image Generation Models: Issues and Their Solutions | Tianyi Zhang et.al. | 2308.13142v1 | null |
2023-08-24 | Dense Text-to-Image Generation with Attention Modulation | Yunji Kim et.al. | 2308.12964v1 | link |
2023-08-24 | Language as Reality: A Co-Creative Storytelling Game Experience in 1001 Nights using Generative AI | Yuqian Sun et.al. | 2308.12915v1 | null |
2023-08-24 | Hydrogen jet diffusion modeling by using physics-informed graph neural network and sparsely-distributed sensor data | Xinqi Zhang et.al. | 2308.12621v1 | null |
2023-08-24 | APLA: Additional Perturbation for Latent Noise with Adversarial Training Enables Consistency | Yupu Yao et.al. | 2308.12605v1 | null |
2023-08-23 | On-Manifold Projected Gradient Descent | Aaron Mahler et.al. | 2308.12279v1 | null |
2023-08-23 | Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning | Jiasheng Ye et.al. | 2308.12219v1 | link |
2023-08-23 | Pulse shape discrimination for the CONUS experiment in the keV and sub-keV regime | H. Bonet et.al. | 2308.12105v1 | null |
2023-08-22 | Theory of Transverse Mode Instability in Fiber Amplifiers with Multimode Excitations | Kabish Wisal et.al. | 2308.11599v1 | null |
2023-08-22 | NIPG-DG schemes for transformed master equations modeling open quantum systems | Jose A. Morales Escalante et.al. | 2308.11580v1 | null |
2023-08-22 | IT3D: Improved Text-to-3D Generation with Explicit View Synthesis | Yiwen Chen et.al. | 2308.11473v1 | link |
2023-08-22 | SDeMorph: Towards Better Facial De-morphing from Single Morph | Nitish Shukla et.al. | 2308.11442v1 | null |
2023-08-21 | TADA! Text to Animatable Digital Avatars | Tingting Liao et.al. | 2308.10899v1 | null |
2023-08-21 | Election Manipulation in Social Networks with Single-Peaked Agents | Vincenzo Auletta et.al. | 2308.10845v1 | null |
2023-08-21 | Backdooring Textual Inversion for Concept Censorship | Yutong Wu et.al. | 2308.10718v1 | null |
2023-08-21 | EVE: Efficient zero-shot text-based Video Editing with Depth Map Guidance and Temporal Consistency Constraints | Yutao Chen et.al. | 2308.10648v1 | null |
2023-08-21 | Frequency Compensated Diffusion Model for Real-scene Dehazing | Jing Wang et.al. | 2308.10510v1 | link |
2023-08-21 | Texture Generation on 3D Meshes with Point-UV Diffusion | Xin Yu et.al. | 2308.10490v1 | null |
2023-08-18 | Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization | Soumik Mukhopadhyay et.al. | 2308.09716v1 | link |
2023-08-18 | HumanLiff: Layer-wise 3D Human Generation with Diffusion Model | Shoukang Hu et.al. | 2308.09712v1 | null |
2023-08-18 | SimDA: Simple Diffusion Adapter for Efficient Video Generation | Zhen Xing et.al. | 2308.09710v1 | null |
2023-08-18 | Guide3D: Create 3D Avatars from Text and Image Guidance | Yukang Cao et.al. | 2308.09705v1 | null |
2023-08-18 | PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation | Hanbing Liu et.al. | 2308.09678v1 | link |
2023-08-18 | Constrained Bayesian Optimization Using a Lagrange Multiplier Applied to Power Transistor Design | Ping-Ju Chuang et.al. | 2308.09612v1 | null |
2023-08-18 | Language-Guided Diffusion Model for Visual Grounding | Sijia Chen et.al. | 2308.09599v1 | null |
2023-08-18 | StableVideo: Text-driven Consistency-aware Diffusion Video Editing | Wenhao Chai et.al. | 2308.09592v1 | link |
2023-08-18 | O$^2$-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model | Yubin Hu et.al. | 2308.09591v1 | link |
2023-08-16 | TeCH: Text-guided Reconstruction of Lifelike Clothed Humans | Yangyi Huang et.al. | 2308.08545v1 | link |
2023-08-16 | Voxlines: Streamline Transparency through Voxelization and View-Dependent Line Orders | Besm Osman et.al. | 2308.08436v1 | null |
2023-08-16 | Diff-CAPTCHA: An Image-based CAPTCHA with Security Enhanced by Denoising Diffusion Model | Ran Jiang et.al. | 2308.08367v1 | null |
2023-08-18 | Dual-Stream Diffusion Net for Text-to-Video Generation | Binhui Liu et.al. | 2308.08316v2 | null |
2023-08-16 | Electron transfer efficiency in liquid xenon across THGEM holes | G. Martínez-Lema et.al. | 2308.08314v1 | null |
2023-08-15 | StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models | Zhizhong Wang et.al. | 2308.07863v1 | null |
2023-08-15 | CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction | Yan Di et.al. | 2308.07837v1 | null |
2023-08-15 | DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding | Jeongsoo Choi et.al. | 2308.07787v1 | link |
2023-08-15 | Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model | Bosheng Qin et.al. | 2308.07749v1 | null |
2023-08-14 | Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation | Alexander Martin et.al. | 2308.07316v1 | link |
2023-08-14 | DiffSED: Sound Event Detection with Denoising Diffusion | Swapnil Bhosale et.al. | 2308.07293v1 | null |
2023-08-14 | Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage | Dario Cioni et.al. | 2308.07151v1 | link |
2023-08-14 | Temporal clustering of social interactions trades-off disease spreading and knowledge diffusion | Giulia Cencetti et.al. | 2308.07058v1 | link |
2023-08-14 | Bayesian Flow Networks | Alex Graves et.al. | 2308.07037v1 | link |
2023-08-14 | An efficient topology optimization method for steady gas flows in all flow regimes | Ruifeng Yuan et.al. | 2308.07018v1 | null |
2023-08-14 | Discrete Conditional Diffusion for Reranking in Recommendation | Xiao Lin et.al. | 2308.06982v1 | null |
2023-08-11 | Acoustofluidic Engineering Functional Vessel-on-a-Chip | Yue Wu et.al. | 2308.06219v1 | null |
2023-08-11 | DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Weijia Wu et.al. | 2308.06160v1 | link |
2023-08-11 | Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow | Junhong Gou et.al. | 2308.06101v1 | link |
2023-08-11 | Diffusion-based Visual Counterfactual Explanations – Towards Systematic Quantitative Evaluation | Philipp Vaeth et.al. | 2308.06100v1 | link |
2023-08-11 | Head Rotation in Denoising Diffusion Models | Andrea Asperti et.al. | 2308.06057v1 | link |
2023-08-11 | Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning | Chun-Mei Feng et.al. | 2308.06038v1 | link |
2023-08-11 | Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation | Yuki Endo et.al. | 2308.06027v1 | link |
2023-08-14 | Audio is all in one: speech-driven gesture synthetics using WavLM pre-trained model | Fan Zhang et.al. | 2308.05995v2 | null |
2023-08-10 | AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Haohe Liu et.al. | 2308.05734v1 | link |
2023-08-10 | PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers | Phillip Lippe et.al. | 2308.05732v1 | null |
2023-08-10 | Masked Diffusion as Self-supervised Representation Learner | Zixuan Pan et.al. | 2308.05695v1 | null |
2023-08-10 | Generative Diffusion Models for Radio Wireless Channel Modelling and Sampling | Ushnish Sengupta et.al. | 2308.05583v1 | null |
2023-08-10 | Fokker-Planck-Poisson kinetics: Multi-phase flow beyond equilibrium | Mohsen Sadr et.al. | 2308.05580v1 | null |
2023-08-09 | LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation | Leigang Qu et.al. | 2308.05095v1 | null |
2023-08-09 | Do Diffusion Models Suffer Error Propagation? Theoretical Analysis and Consistency Regularization | Yangming Li et.al. | 2308.05021v1 | null |
2023-08-10 | IDiff-Face: Synthetic-based Face Recognition through Fizzy Identity-Conditioned Diffusion Models | Fadi Boutros et.al. | 2308.04995v2 | link |
2023-08-09 | CasCIFF: A Cross-Domain Information Fusion Framework Tailored for Cascade Prediction in Social Networks | Hongjun Zhu et.al. | 2308.04961v1 | link |
2023-08-09 | Interaction-induced directional transport on periodically driven chains | Helena Drüeke et.al. | 2308.04845v1 | null |
2023-08-08 | DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images | Xuechao Zou et.al. | 2308.04417v1 | link |
2023-08-08 | Cloth2Tex: A Customized Cloth Texture Generation Pipeline for 3D Virtual Try-On | Daiheng Gao et.al. | 2308.04288v1 | null |
2023-08-08 | MCDAN: a Multi-scale Context-enhanced Dynamic Attention Network for Diffusion Prediction | Xiaowen Wang et.al. | 2308.04266v1 | null |
2023-08-08 | FLIRT: Feedback Loop In-context Red Teaming | Ninareh Mehrabi et.al. | 2308.04265v1 | null |
2023-08-08 | MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion | Yizhuo Lu et.al. | 2308.04249v1 | link |
2023-08-08 | Synthetic Augmentation with Large-scale Unconditional Pre-training | Jiarong Ye et.al. | 2308.04020v1 | link |
2023-08-07 | Diffusion Model in Causal Inference with Unmeasured Confounders | Tatsuhiro Shimizu et.al. | 2308.03669v1 | link |
2023-08-07 | AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose | Huichao Zhang et.al. | 2308.03610v1 | link |
2023-08-08 | DiffSynth: Latent In-Iteration Deflickering for Realistic Video Synthesis | Zhongjie Duan et.al. | 2308.03463v2 | link |
2023-08-04 | Quantum Dynamical Approach to Predicting the Optical Pumping Threshold for Lasing in Organic Materials | Bin Zhang et.al. | 2308.02447v1 | null |
2023-08-04 | Diffusion-Augmented Depth Prediction with Sparse Annotations | Jiaqi Li et.al. | 2308.02283v1 | null |
2023-08-04 | Painterly Image Harmonization using Diffusion Model | Lingxiao Lu et.al. | 2308.02228v1 | link |
2023-08-03 | Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling | Zhao Yang et.al. | 2308.01850v1 | link |
2023-08-03 | DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models | Jianxin Lin et.al. | 2308.01655v1 | null |
2023-08-03 | Reference-Free Isotropic 3D EM Reconstruction using Diffusion Models | Kyungryun Lee et.al. | 2308.01594v1 | null |
2023-08-03 | Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS | Myeongjin Ko et.al. | 2308.01573v1 | link |
2023-08-02 | Patched Denoising Diffusion Models For High-Resolution Image Synthesis | Zheng Ding et.al. | 2308.01316v1 | link |
2023-08-02 | Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation | Guojin Zhong et.al. | 2308.01147v1 | link |
2023-08-02 | DiffusePast: Diffusion-based Generative Replay for Class Incremental Semantic Segmentation | Jingfan Chen et.al. | 2308.01127v1 | null |
2023-08-01 | Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models | Cheng-Yu Hsieh et.al. | 2308.00675v1 | null |
2023-08-01 | Diffusion Model for Camouflaged Object Detection | Zhennan Chen et.al. | 2308.00303v1 | null |
2023-07-31 | Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models | Weikang Yu et.al. | 2307.16865v1 | null |
2023-07-31 | From Generation to Suppression: Towards Effective Irregular Glow Removal for Nighttime Visibility Enhancement | Wanyu Wu et.al. | 2307.16783v1 | null |
2023-07-31 | Understanding Dynamics in Coarse-Grained Models: III. Roles of Rotational Motion and Translation-Rotation Coupling in Coarse-Grained Dynamics | Jaehyeok Jin et.al. | 2307.16747v1 | null |
2023-07-31 | DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation | Runyang Feng et.al. | 2307.16687v1 | null |
2023-07-31 | On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey | Mingyuan Fan et.al. | 2307.16680v1 | null |
2023-07-28 | Understanding the Anomalous Diffusion of Water in Aqueous Electrolytes Using Machine Learned Potentials | Nikhil V. S. Avula et.al. | 2307.15576v1 | null |
2023-07-28 | Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding | Chunyu Qiang et.al. | 2307.15484v1 | null |
2023-07-27 | The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation | Lingdong Kong et.al. | 2307.15061v1 | link |
2023-07-27 | TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis | Zihan Zhang et.al. | 2307.15042v1 | null |
2023-07-27 | Generative convective parametrization of dry atmospheric boundary layer | Florian Heyder et.al. | 2307.14857v1 | null |
2023-07-27 | Empirical analysis of congestion spreading in Seoul traffic network | Jung-Hoon Jung et.al. | 2307.14800v1 | null |
2023-07-26 | Virtual Mirrors: Non-Line-of-Sight Imaging Beyond the Third Bounce | Diego Royo et.al. | 2307.14341v1 | null |
2023-07-26 | Visual Instruction Inversion: Image Editing via Visual Prompting | Thao Nguyen et.al. | 2307.14331v1 | link |
2023-07-26 | Founding a mathematical diffusion model in linguistics. The case study of German syntactic features in the North-Eastern Italian dialects | I. Lazzizzera et.al. | 2307.14291v1 | null |
2023-07-26 | VideoControlNet: A Motion-Guided Video-to-Video Translation Framework by Using Diffusion Model with ControlNet | Zhihao Hu et.al. | 2307.14073v1 | null |
2023-07-25 | Comparing phase-space and phenomenological modeling approaches for Lagrangian particles settling in a turbulent boundary layer | Andrew P. Grace et.al. | 2307.13659v1 | null |
2023-07-25 | Fake It Without Making It: Conditioned Face Generation for Accurate 3D Face Shape Estimation | Will Rowan et.al. | 2307.13639v1 | null |
2023-07-25 | XDLM: Cross-lingual Diffusion Language Model for Machine Translation | Linyao Chen et.al. | 2307.13560v1 | null |
2023-07-25 | Not with my name! Inferring artists’ names of input strings employed by Diffusion Models | Roberto Leotta et.al. | 2307.13527v1 | link |
2023-07-24 | A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Jindong Gu et.al. | 2307.12980v1 | link |
2023-07-24 | Data-free Black-box Attack based on Diffusion Model | Mingwen Shao et.al. | 2307.12872v1 | null |
2023-07-24 | Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry | Yong-Hyun Park et.al. | 2307.12868v1 | link |
2023-07-24 | The ro-vibrational $ν_2$ mode spectrum of methane investigated by ultrabroadband coherent Raman spectroscopy | Francesco Mazza et.al. | 2307.12740v1 | null |
2023-07-21 | FEDD – Fair, Efficient, and Diverse Diffusion-based Lesion Segmentation and Malignancy Classification | Héctor Carrión et.al. | 2307.11654v1 | link |
2023-07-21 | Mixbiotic society measures: Assessment of community well-going as living system | Takeshi Kato et.al. | 2307.11594v1 | null |
2023-07-21 | Predict, Refine, Synthesize: Self-Guiding Diffusion Models for Probabilistic Time Series Forecasting | Marcel Kollovieh et.al. | 2307.11494v1 | link |
2023-07-20 | Hypergraph Diffusions and Resolvents for Norm-Based Hypergraph Laplacians | Konstantinos Ameranis et.al. | 2307.11042v1 | null |
2023-07-20 | Progressive distillation diffusion for raw music generation | Svetlana Pavlova et.al. | 2307.10994v1 | null |
2023-07-20 | Energy-consistent discretization of viscous dissipation with application to natural convection flow | Benjamin Sanderse et.al. | 2307.10874v1 | null |
2023-07-19 | FABRIC: Personalizing Diffusion Models with Iterative Feedback | Dimitri von Rütte et.al. | 2307.10159v1 | link |
2023-07-19 | XSkill: Cross Embodiment Skill Discovery | Mengda Xu et.al. | 2307.09955v1 | link |
2023-07-19 | Visual Representation for Patterned Proliferation of Social Media Addiction: Quantitative Model and Network Analysis | Dibyajyoti Mallick et.al. | 2307.09902v1 | null |
2023-07-19 | BSDM: Background Suppression Diffusion Model for Hyperspectral Anomaly Detection | Jitao Ma et.al. | 2307.09861v1 | link |
2023-07-19 | A Siamese-based Verification System for Open-set Architecture Attribution of Synthetic Images | Lydia Abady et.al. | 2307.09822v1 | link |
2023-07-18 | AnyDoor: Zero-shot Object-level Image Customization | Xi Chen et.al. | 2307.09481v1 | link |
2023-07-18 | Augmenting CLIP with Improved Visio-Linguistic Reasoning | Samyadeep Basu et.al. | 2307.09233v1 | null |
2023-07-17 | Diffusion Models Beat GANs on Image Classification | Soumik Mukhopadhyay et.al. | 2307.08702v1 | null |
2023-07-17 | Flow Matching in Latent Space | Quan Dao et.al. | 2307.08698v1 | link |
2023-07-17 | SEMI-DiffusionInst: A Diffusion Model Based Approach for Semiconductor Defect Classification and Segmentation | Vic De Ridder et.al. | 2307.08693v1 | null |
2023-07-17 | Multimodal Diffusion Segmentation Model for Object Segmentation from Manipulation Instructions | Yui Iioka et.al. | 2307.08597v1 | null |
2023-07-17 | Identity-Preserving Aging of Face Images via Latent Diffusion Models | Sudipta Banerjee et.al. | 2307.08585v1 | link |
2023-07-17 | Synthetic Lagrangian Turbulence by Generative Diffusion Models | Tianyi Li et.al. | 2307.08529v1 | link |
2023-07-17 | How far does turbulence spread? | Alexandros Alexakis et.al. | 2307.08469v1 | null |
2023-07-17 | Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation | Luozhou Wang et.al. | 2307.08448v1 | link |
2023-07-18 | Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model | Rongke Liu et.al. | 2307.08424v2 | null |
2023-07-14 | NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis | Nilesh Kulkarni et.al. | 2307.07511v1 | null |
2023-07-14 | DreamTeacher: Pretraining Image Backbones with Deep Generative Models | Daiqing Li et.al. | 2307.07487v1 | null |
2023-07-14 | Inverse Evolution Layers: Physics-informed Regularizers for Deep Neural Networks | Chaoyu Liu et.al. | 2307.07344v1 | null |
2023-07-14 | High-density single-molecule maps reveal transient membrane receptor interactions within a dynamically varying environment | Nicolas Mateos et.al. | 2307.07334v1 | null |
2023-07-14 | Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection | Alessandro Flaborea et.al. | 2307.07205v1 | link |
2023-07-14 | Federated Learning-Empowered AI-Generated Content in Wireless Networks | Xumin Huang et.al. | 2307.07146v1 | null |
2023-07-13 | HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models | Nataniel Ruiz et.al. | 2307.06949v1 | null |
2023-07-12 | Exposing the Fake: Effective Diffusion-Generated Images Detection | Ruipeng Ma et.al. | 2307.06272v1 | null |
2023-07-12 | Diffusion Based Multi-Agent Adversarial Tracking | Sean Ye et.al. | 2307.06244v1 | null |
2023-07-12 | Functional light diffusers based on hybrid CsPbBr$_3$/SiO$_2$ aero-framework structures for laser light illumination and conversion | Lena M. Saure et.al. | 2307.06197v1 | null |
2023-07-12 | Biofilm.jl: a fast solver for one-dimensional biofilm chemistry and ecology | Mark Owkes et.al. | 2307.06153v1 | link |
2023-07-11 | Metropolis Sampling for Constrained Diffusion Models | Nic Fishman et.al. | 2307.05439v1 | null |
2023-07-11 | On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models | Marija Ivanovska et.al. | 2307.05397v1 | null |
2023-07-10 | Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement | Anthony Simeonov et.al. | 2307.04751v1 | null |
2023-07-10 | Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback | Jaskirat Singh et.al. | 2307.04749v1 | null |
2023-07-10 | Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning | Suzan Ece Ada et.al. | 2307.04726v1 | null |
2023-07-10 | AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | Yuwei Guo et.al. | 2307.04725v1 | link |
2023-07-10 | Machine learning potentials with Iterative Boltzmann Inversion: training to experiment | Sakib Matin et.al. | 2307.04712v1 | null |
2023-07-10 | Encapsulation Structure and Dynamics in Hypergraphs | Timothy LaRock et.al. | 2307.04613v1 | link |
2023-07-07 | Three-dimensional Vorticity Effects on Extinction Behavior of Laminar Flamelets | Wes Hellwig et.al. | 2307.03695v1 | null |
2023-07-07 | Simulation-free Schrödinger bridges via score and flow matching | Alexander Tong et.al. | 2307.03672v1 | link |
2023-07-07 | IPO-LDM: Depth-aided 360-degree Indoor RGB Panorama Outpainting via Latent Diffusion Model | Tianhao Wu et.al. | 2307.03177v2 | null |
2023-07-06 | How to Detect Unauthorized Data Usages in Text-to-image Diffusion Models | Zhenting Wang et.al. | 2307.03108v1 | link |
2023-07-06 | Origin-Destination Travel Time Oracle for Map-based Services | Yan Lin et.al. | 2307.03048v1 | null |
2023-07-06 | Multi-modal multi-class Parkinson disease classification using CNN and decision level fusion | Sushanta Kumar Sahu et.al. | 2307.02978v1 | null |
2023-07-06 | On the Cultural Gap in Text-to-Image Generation | Bingshuai Liu et.al. | 2307.02971v1 | null |
2023-07-05 | DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models | Chong Mou et.al. | 2307.02421v1 | link |
2023-07-05 | RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation | Renato Sortino et.al. | 2307.02392v1 | null |
2023-07-06 | Error Approximation and Bias Correction in Dynamic Problems using a Recurrent Neural Network/Finite Element Hybrid Model | Moritz von Tresckow et.al. | 2307.02349v2 | null |
2023-07-05 | Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality | Peter Lorenz et.al. | 2307.02347v1 | link |
2023-07-05 | SVDM: Single-View Diffusion Model for Pseudo-Stereo 3D Object Detection | Yuguang Shi et.al. | 2307.02270v1 | null |
2023-07-03 | Improved sampling via learned diffusions | Lorenz Richter et.al. | 2307.01198v1 | null |
2023-07-03 | Squeezing Large-Scale Diffusion Models for Mobile | Jiwoong Choi et.al. | 2307.01193v1 | null |
2023-07-03 | Learning Mixtures of Gaussians Using the DDPM Objective | Kulin Shah et.al. | 2307.01178v1 | null |
2023-07-03 | Investigating Data Memorization in 3D Latent Diffusion Models for Medical Image Synthesis | Salman Ul Hassan Dar et.al. | 2307.01148v1 | null |
2023-07-03 | A phase field-based framework for electro-chemo-mechanical fracture: crack-contained electrolytes, chemical reactions and stabilisation | T. Hageman et.al. | 2307.01105v1 | null |
2023-07-03 | MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion | Shitao Tang et.al. | 2307.01097v1 | link |
2023-07-03 | TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models | Marija Ivanovska et.al. | 2307.01064v1 | link |
2023-06-30 | Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors | Guocheng Qian et.al. | 2306.17843v1 | link |
2023-06-30 | Content-Preserving Diffusion Model for Unsupervised AS-OCT image Despeckling | Li Sanqian et.al. | 2306.17717v1 | null |
2023-06-29 | Generate Anything Anywhere in Any Scene | Yuheng Li et.al. | 2306.17154v1 | null |
2023-06-29 | Filtered-Guided Diffusion: Fast Filter Guidance for Black-Box Diffusion Models | Zeqi Gu et.al. | 2306.17141v1 | link |
2023-06-29 | ID-Pose: Sparse-view Camera Pose Estimation by Inverting Diffusion Models | Weihao Cheng et.al. | 2306.17140v1 | null |
2023-06-29 | Learning Nuclei Representations with Masked Image Modelling | Piotr Wójcik et.al. | 2306.17116v1 | null |
2023-06-29 | Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation | Zibo Zhao et.al. | 2306.17115v1 | link |
2023-06-29 | Towards rapid extracellular vesicles colorimetric detection using optofluidics-enhanced color-changing optical metasurface | Chuchuan Hong et.al. | 2306.17102v1 | null |
2023-06-28 | DiffComplete: Diffusion-based Generative 3D Shape Completion | Ruihang Chu et.al. | 2306.16329v1 | null |
2023-06-28 | UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data | Heeseung Kim et.al. | 2306.16083v1 | link |
2023-06-28 | PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment | Jianyuan Wang et.al. | 2306.15667v2 | null |
2023-06-27 | Stabilizing ultrathin Silver (Ag) films on different substrates | Allamula Ashok et.al. | 2306.15575v1 | null |
2023-06-27 | Trajectory Generation, Control, and Safety with Denoising Diffusion Probabilistic Models | Nicolò Botteghi et.al. | 2306.15512v1 | link |
2023-06-27 | Miniaturized gas-solid fluidized beds | Fernando David Cúñez Benalcázar et.al. | 2306.15463v1 | null |
2023-06-27 | Adversarial Training for Graph Neural Networks | Lukas Gosch et.al. | 2306.15427v1 | null |
2023-06-26 | Fuzzy-Conditioned Diffusion and Diffusion Projection Attention Applied to Facial Image Correction | Majed El Helou et.al. | 2306.14891v1 | link |
2023-06-26 | Restart Sampling for Improving Generative Processes | Yilun Xu et.al. | 2306.14878v1 | link |
2023-06-26 | ViNT: A Foundation Model for Visual Navigation | Dhruv Shah et.al. | 2306.14846v1 | null |
2023-06-26 | ProtoDiff: Learning to Learn Prototypical Networks by Task-Guided Diffusion | Yingjun Du et.al. | 2306.14770v1 | link |
2023-06-23 | Fast Macroscopic Forcing Method | Spencer H. Bryngelson et.al. | 2306.13625v1 | link |
2023-06-23 | DreamEditor: Text-Driven 3D Scene Editing with Neural Fields | Jingyu Zhuang et.al. | 2306.13455v1 | link |
2023-06-22 | Continuous Layout Editing of Single Images with Diffusion Models | Zhiyuan Zhang et.al. | 2306.13078v1 | null |
2023-06-22 | Towards More Realistic Membership Inference Attacks on Large Diffusion Models | Jan Dubiński et.al. | 2306.12983v1 | null |
2023-06-22 | On the nature of the two-positron bond: Evidence for a novel bond type | Mohammad Goli et.al. | 2306.12899v1 | null |
2023-06-22 | Stress-induced Artificial neuron spiking in Diffusive memristors | Debi Pattnaik et.al. | 2306.12853v1 | null |
2023-06-21 | DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation | Yukun Huang et.al. | 2306.12422v1 | null |
2023-06-21 | HumanDiffusion: diffusion model using perceptual gradients | Yota Ueda et.al. | 2306.12169v1 | null |
2023-06-20 | Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning | Huiguo He et.al. | 2306.11731v1 | null |
2023-06-20 | Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision | Ayush Tewari et.al. | 2306.11719v1 | null |
2023-06-20 | Align, Adapt and Inject: Sound-guided Unified Image Generation | Yue Yang et.al. | 2306.11504v1 | null |
2023-06-20 | EMoG: Synthesizing Emotive Co-speech 3D Gesture with Diffusion Model | Lianying Yin et.al. | 2306.11496v1 | null |
2023-06-16 | Group Orthogonalization Regularization For Vision Models Adaptation and Robustness | Yoav Kurtz et.al. | 2306.10001v1 | link |
2023-06-16 | Towards Better Certified Segmentation via Diffusion Models | Othmane Laousy et.al. | 2306.09949v1 | null |
2023-06-16 | Unique information from common diffusion MRI models about white-matter differences across the human adult lifespan | Rafael Neto Henriques et.al. | 2306.09942v1 | link |
2023-06-16 | Drag-guided diffusion models for vehicle image generation | Nikos Arechiga et.al. | 2306.09935v1 | null |
2023-06-16 | Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models | Geon Yeong Park et.al. | 2306.09869v1 | link |
2023-06-16 | AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation | Yifei Zeng et.al. | 2306.09864v1 | null |
2023-06-15 | Generative Proxemics: A Prior for 3D Social Interaction from Images | Lea Müller et.al. | 2306.09337v1 | link |
2023-06-15 | ArtFusion: Arbitrary Style Transfer using Dual Conditional Latent Diffusion Models | Dar-Yen Chen et.al. | 2306.09330v1 | link |
2023-06-15 | Diffusion Models for Zero-Shot Open-Vocabulary Segmentation | Laurynas Karazija et.al. | 2306.09316v1 | null |
2023-06-15 | Fast Training of Diffusion Models with Masked Transformers | Hongkai Zheng et.al. | 2306.09305v1 | link |
2023-06-15 | Conditional Human Sketch Synthesis with Explicit Abstraction Control | Dar-Yen Chen et.al. | 2306.09274v1 | null |
2023-06-15 | Training Diffusion Classifiers with Denoising Assistance | Chandramouli Sastry et.al. | 2306.09192v1 | null |
2023-06-13 | Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation | Shuai Yang et.al. | 2306.07954v1 | null |
2023-06-13 | Viewset Diffusion: (0-)Image-Conditioned 3D Generative Models from 2D Data | Stanislaw Szymanowicz et.al. | 2306.07881v1 | null |
2023-06-13 | Diffusive and convective dissolution of carbon dioxide in a vertical cylindrical cell | Daniël P. Faasen et.al. | 2306.07721v1 | null |
2023-06-12 | Controlling Text-to-Image Diffusion by Orthogonal Finetuning | Zeju Qiu et.al. | 2306.07280v1 | null |
2023-06-12 | MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images | Junchen Zhu et.al. | 2306.07257v1 | null |
2023-06-12 | Diffusion Models for Black-Box Optimization | Siddarth Krishnamoorthy et.al. | 2306.07180v1 | link |
2023-06-12 | InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions | Jiale Xu et.al. | 2306.07154v1 | null |
2023-06-12 | Latent Dynamical Implicit Diffusion Processes | Mohammad R. Rezaei et.al. | 2306.07077v1 | null |
2023-06-09 | Bridging Scales: a Hybrid Model to Simulate Vascular Tumor Growth and Treatment Response | Tobias Duswald et.al. | 2306.05994v1 | link |
2023-06-09 | DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles | Tal Daniel et.al. | 2306.05957v1 | link |
2023-06-09 | Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model | Yida Chen et.al. | 2306.05720v1 | link |
2023-06-12 | Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion | Haogeng Liu et.al. | 2306.05708v2 | null |
2023-06-09 | RePaint-NeRF: NeRF Editting via Semantic Masks and Diffusion Models | Xingchen Zhou et.al. | 2306.05668v1 | null |
2023-06-08 | Grounded Text-to-Image Synthesis with Attention Refocusing | Quynh Phung et.al. | 2306.05427v1 | null |
2023-06-08 | ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process | Changyao Tian et.al. | 2306.05423v1 | null |
2023-06-08 | Stochastic Multi-Person 3D Motion Forecasting | Sirui Xu et.al. | 2306.05421v1 | link |
2023-06-08 | Improving Negative-Prompt Inversion via Proximal Guidance | Ligong Han et.al. | 2306.05414v1 | link |
2023-06-08 | PriSampler: Mitigating Property Inference of Diffusion Models | Hailong Hu et.al. | 2306.05208v1 | null |
2023-06-08 | SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions | Yuseung Lee et.al. | 2306.05178v1 | null |
2023-06-07 | Designing a Better Asymmetric VQGAN for StableDiffusion | Zixin Zhu et.al. | 2306.04632v1 | link |
2023-06-07 | ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections | Chun-Han Yao et.al. | 2306.04619v1 | null |
2023-06-08 | Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt | Kai Chen et.al. | 2306.04607v2 | null |
2023-06-07 | On the Design Fundamentals of Diffusion Models: A Survey | Ziyi Chang et.al. | 2306.04542v1 | null |
2023-06-07 | Multi-modal Latent Diffusion | Mustapha Bounoua et.al. | 2306.04445v1 | null |
2023-06-07 | Synthesizing realistic sand assemblies with denoising diffusion in latent space | Nikolaos N. Vlassis et.al. | 2306.04411v1 | null |
2023-06-07 | Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance | Gihyun Kwon et.al. | 2306.04396v1 | link |
2023-06-06 | Emergent Correspondence from Image Diffusion | Luming Tang et.al. | 2306.03881v1 | link |
2023-06-06 | Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation | Xinrong Hu et.al. | 2306.03878v1 | link |
2023-06-06 | Newly Formed Cities: an AI Curation | Dario Negueruela del Castillo et.al. | 2306.03753v1 | null |
2023-06-06 | Towards Visual Foundational Models of Physical Scenes | Chethan Parameshwara et.al. | 2306.03727v1 | null |
2023-06-06 | Diffusional exchange versus microscopic kurtosis from CTI: two conflicting interpretations of the same data | Arthur Chakwizira et.al. | 2306.03661v1 | null |
2023-06-05 | Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models | Andrew F. Luo et.al. | 2306.03089v1 | null |
2023-06-05 | MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion | Chiyu Max Jiang et.al. | 2306.03083v1 | null |
2023-06-05 | Influence of the finite transverse size of the accelerating region on the relativistic feedback | Alexander Sedelnikov et.al. | 2306.03059v1 | null |
2023-06-05 | HeadSculpt: Crafting 3D Head Avatars with Text | Xiao Han et.al. | 2306.03038v1 | null |
2023-06-05 | Interpretable Alzheimer’s Disease Classification Via a Contrastive Diffusion Autoencoder | Ayodeji Ijishakin et.al. | 2306.03022v1 | link |
2023-06-05 | Complex Preferences for Different Convergent Priors in Discrete Graph Diffusion | Alex M. Tseng et.al. | 2306.02957v1 | null |
2023-06-05 | INDigo: An INN-Guided Probabilistic Diffusion Algorithm for Inverse Problems | Di You et.al. | 2306.02949v1 | null |
2023-06-05 | Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions | Shaoxu Li et.al. | 2306.02903v1 | link |
2023-06-02 | Video Colorization with Pre-trained Text-to-Image Diffusion Models | Hanyuan Liu et.al. | 2306.01732v1 | null |
2023-06-02 | Denoising Diffusion Semantic Segmentation with Mask Prior Modeling | Zeqiang Lai et.al. | 2306.01721v1 | link |
2023-06-02 | DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation | Guanqun Bi et.al. | 2306.01657v1 | null |
2023-06-02 | Influence Maximization with Fairness at Scale (Extended Version) | Yuting Feng et.al. | 2306.01587v1 | null |
2023-06-02 | PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models | Jiacheng Chen et.al. | 2306.01461v1 | link |
2023-06-02 | Diffusion Self-Guidance for Controllable Image Generation | Dave Epstein et.al. | 2306.00986v2 | null |
2023-06-01 | StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners | Yonglong Tian et.al. | 2306.00984v1 | link |
2023-06-01 | StyleDrop: Text-to-Image Generation in Any Style | Kihyuk Sohn et.al. | 2306.00983v1 | null |
2023-06-01 | SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds | Yanyu Li et.al. | 2306.00980v1 | link |
2023-06-01 | Intriguing Properties of Text-guided Diffusion Models | Qihao Liu et.al. | 2306.00974v1 | link |
2023-06-01 | Intelligent Grimm – Open-ended Visual Storytelling via Latent Diffusion Models | Chang Liu et.al. | 2306.00973v1 | link |
2023-06-01 | ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation | Shaozhe Hao et.al. | 2306.00971v1 | link |
2023-06-01 | The Hidden Language of Diffusion Models | Hila Chefer et.al. | 2306.00966v1 | link |
2023-06-01 | Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation | Minghui Hu et.al. | 2306.00964v1 | null |
2023-06-01 | Differential Diffusion: Giving Each Pixel Its Strength | Eran Levin et.al. | 2306.00950v1 | link |
2023-05-31 | Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images | Junxing Hu et.al. | 2305.20089v1 | null |
2023-05-31 | Understanding and Mitigating Copying in Diffusion Models | Gowthami Somepalli et.al. | 2305.20086v1 | link |
2023-05-31 | Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor | Ruizhi Shao et.al. | 2305.20082v1 | null |
2023-05-31 | Efficient Diffusion Policies for Offline Reinforcement Learning | Bingyi Kang et.al. | 2305.20081v1 | link |
2023-05-31 | A Unified Conditional Framework for Diffusion-based Image Restoration | Yi Zhang et.al. | 2305.20049v1 | link |
2023-06-01 | Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust | Yuxin Wen et.al. | 2305.20030v2 | link |
2023-05-31 | Protein Design with Guided Discrete Diffusion | Nate Gruver et.al. | 2305.20009v1 | link |
2023-05-31 | GANDiffFace: Controllable Generation of Synthetic Datasets for Face Recognition with Realistic Variations | Pietro Melzi et.al. | 2305.19962v1 | null |
2023-05-31 | A Geometric Perspective on Diffusion Models | Defang Chen et.al. | 2305.19947v1 | null |
2023-05-30 | Ambient Diffusion: Learning Clean Distributions from Corrupted Data | Giannis Daras et.al. | 2305.19256v1 | link |
2023-05-30 | PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation | Jialu Li et.al. | 2305.19195v1 | null |
2023-05-30 | Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models | Ernie Chu et.al. | 2305.19193v1 | null |
2023-05-30 | Calliffusion: Chinese Calligraphy Generation and Style Transfer with Diffusion Modeling | Qisheng Liao et.al. | 2305.19124v1 | null |
2023-05-30 | DiffMatch: Diffusion Model for Dense Matching | Jisu Nam et.al. | 2305.19094v1 | link |
2023-05-30 | Likelihood-Based Diffusion Language Models | Ishaan Gulrajani et.al. | 2305.18619v1 | link |
2023-05-29 | RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths | Zeyue Xue et.al. | 2305.18295v1 | null |
2023-05-29 | Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models | Yuchao Gu et.al. | 2305.18292v1 | link |
2023-05-29 | Photoswap: Personalized Subject Swapping in Images | Jing Gu et.al. | 2305.18286v1 | null |
2023-05-29 | Reconstructing the Mind’s Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors | Paul S. Scotti et.al. | 2305.18274v1 | link |
2023-05-29 | Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising | Fu-Yun Wang et.al. | 2305.18264v1 | link |
2023-05-29 | GlyphControl: Glyph Conditional Control for Visual Text Generation | Yukang Yang et.al. | 2305.18259v1 | link |
2023-05-26 | Improving accuracy of GPT-3/4 results on biomedical data using a retrieval-augmented language model | David Soong et.al. | 2305.17116v1 | null |
2023-05-26 | ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing | Min Zhao et.al. | 2305.17098v1 | link |
2023-05-26 | The reaction-diffusion basis of animated patterns in eukaryotic flagella | James F. Cass et.al. | 2305.17032v1 | link |
2023-05-26 | Accelerating Diffusion Models for Inverse Problems through Shortcut Sampling | Gongye Liu et.al. | 2305.16965v1 | link |
2023-05-26 | Learning to Imagine: Visually-Augmented Natural Language Generation | Tianyi Tang et.al. | 2305.16944v1 | link |
2023-05-26 | DiffusionNAG: Task-guided Neural Architecture Generation with Diffusion Models | Sohyun An et.al. | 2305.16943v1 | link |
2023-05-26 | CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography | Jiwen Yu et.al. | 2305.16936v1 | link |
2023-05-26 | Turbulence calculation based on the extended Navier-Stokes equations | Shanwen Tan et.al. | 2305.16923v1 | null |
2023-05-25 | Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models | Shihao Zhao et.al. | 2305.16322v1 | link |
2023-05-25 | Eclipse: Disambiguating Illumination and Materials using Unintended Shadows | Dor Verbin et.al. | 2305.16321v1 | null |
2023-05-25 | Parallel Sampling of Diffusion Models | Andy Shih et.al. | 2305.16317v1 | link |
2023-05-25 | NAP: Neural 3D Articulation Prior | Jiahui Lei et.al. | 2305.16315v1 | null |
2023-05-25 | UMat: Uncertainty-Aware Single Image High Resolution Material Capture | Carlos Rodriguez-Pardo et.al. | 2305.16312v1 | null |
2023-05-25 | Break-A-Scene: Extracting Multiple Concepts from a Single Image | Omri Avrahami et.al. | 2305.16311v1 | link |
2023-05-25 | Look Ma, No Hands! Agent-Environment Factorization of Egocentric Videos | Matthew Chang et.al. | 2305.16301v1 | null |
2023-05-25 | Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation | Lisa Dunlap et.al. | 2305.16289v1 | link |
2023-05-25 | CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graphs | Guangyao Zhai et.al. | 2305.16283v1 | link |
2023-05-25 | UDPM: Upsampling Diffusion Probabilistic Models | Shady Abu-Hussein et.al. | 2305.16269v1 | link |
2023-05-24 | Sin3DM: Learning a Diffusion Model from a Single 3D Textured Shape | Rundi Wu et.al. | 2305.15399v1 | link |
2023-05-24 | A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence | Junyi Zhang et.al. | 2305.15347v1 | link |
2023-05-24 | Training on Thin Air: Improve Image Classification with Generated Data | Yongchao Zhou et.al. | 2305.15316v1 | link |
2023-05-24 | MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation | Marco Bellagente et.al. | 2305.15296v1 | null |
2023-05-23 | Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence | Grace Luo et.al. | 2305.14334v1 | null |
2023-05-23 | SEEDS: Exponential SDE Solvers for Fast High-Quality Sampling from Diffusion Models | Martin Gonzalez et.al. | 2305.14267v1 | link |
2023-05-23 | Improved Convergence of Score-Based Diffusion Models via Prediction-Correction | Francesco Pedrotti et.al. | 2305.14164v1 | null |
2023-05-23 | Realistic Noise Synthesis with Diffusion Models | Qi Wu et.al. | 2305.14022v1 | null |
2023-05-23 | Lightweight Channel Codes for ISI Mitigation in Molecular Communication between Bionanosensors | Dongliang Jing et.al. | 2305.14001v1 | null |
2023-05-23 | Node-wise Diffusion for Scalable Graph Learning | Keke Huang et.al. | 2305.14000v1 | link |
2023-05-22 | VDT: An Empirical Study on Video Diffusion with Transformers | Haoyu Lu et.al. | 2305.13311v1 | link |
2023-05-22 | If at First You Don’t Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection | Shyamgopal Karthik et.al. | 2305.13308v1 | link |
2023-05-23 | Training Diffusion Models with Reinforcement Learning | Kevin Black et.al. | 2305.13301v2 | link |
2023-05-22 | DiffusionNER: Boundary Diffusion for Named Entity Recognition | Yongliang Shen et.al. | 2305.13298v1 | link |
2023-05-22 | U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech | Xin Jing et.al. | 2305.13195v1 | null |
2023-05-22 | Policy Representation via Diffusion Probability Model for Reinforcement Learning | Long Yang et.al. | 2305.13122v1 | link |
2023-05-22 | Energy cascade in the Garrett-Munk spectrum of internal gravity waves | Yue Wu et.al. | 2305.13110v1 | null |
2023-05-19 | Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models | Byungjun Kim et.al. | 2305.11870v1 | link |
2023-05-19 | Any-to-Any Generation via Composable Diffusion | Zineng Tang et.al. | 2305.11846v1 | link |
2023-05-19 | The probability flow ODE is provably fast | Sitan Chen et.al. | 2305.11798v1 | null |
2023-05-19 | Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity | Zijiao Chen et.al. | 2305.11675v1 | null |
2023-05-19 | Few-shot 3D Shape Generation | Jingyuan Zhu et.al. | 2305.11664v1 | null |
2023-05-19 | Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields | Jingbo Zhang et.al. | 2305.11588v1 | link |
2023-05-19 | Brain Captioning: Decoding human brain activity into images and text | Matteo Ferrante et.al. | 2305.11560v1 | null |
2023-05-19 | Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots | Jinyi Hu et.al. | 2305.11540v1 | null |
2023-05-19 | Late-Constraint Diffusion Guidance for Controllable Image Synthesis | Chang Liu et.al. | 2305.11520v1 | link |
2023-05-19 | DiffuSIA: A Spiral Interaction Architecture for Encoder-Decoder Text Diffusion | Chao-Hong Tan et.al. | 2305.11517v1 | null |
2023-05-18 | UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild | Can Qin et.al. | 2305.11147v1 | link |
2023-05-18 | Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces | Javier E Santos et.al. | 2305.11089v1 | link |
2023-05-18 | Inspecting the Geographical Representativeness of Images from Text-to-Image Models | Abhipsa Basu et.al. | 2305.11080v1 | null |
2023-05-18 | Unsupervised Pansharpening via Low-rank Diffusion Model | Xiangyu Rui et.al. | 2305.10925v1 | link |
2023-05-18 | Structural Pruning for Diffusion Models | Gongfan Fang et.al. | 2305.10924v1 | link |
2023-05-18 | VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation | Wenjing Wang et.al. | 2305.10874v1 | null |
2023-05-17 | FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | Guangxuan Xiao et.al. | 2305.10431v1 | link |
2023-05-17 | Raising the Bar for Certified Adversarial Robustness with Diffusion Models | Thomas Altstidl et.al. | 2305.10388v1 | null |
2023-05-17 | A phase field model for droplets suspended in viscous liquids under the influence of electric fields | Yuzhe Qin et.al. | 2305.10296v1 | null |
2023-05-17 | Provably Correct Physics-Informed Neural Networks | Francisco Eiras et.al. | 2305.10157v1 | null |
2023-05-18 | Controllable Mind Visual Diffusion Model | Bohan Zeng et.al. | 2305.10135v2 | link |
2023-05-16 | Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation | Samaneh Azadi et.al. | 2305.09662v1 | null |
2023-05-16 | FitMe: Deep Photorealistic 3D Morphable Model Avatars | Alexandros Lattas et.al. | 2305.09641v1 | null |
2023-05-16 | AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation | Tong Wu et.al. | 2305.09515v1 | link |
2023-05-16 | Discrete Diffusion Probabilistic Models for Symbolic Music Generation | Matthias Plasser et.al. | 2305.09489v1 | link |
2023-05-17 | Multi-Level Global Context Cross Consistency Model for Semi-Supervised Ultrasound Image Segmentation with Diffusion Model | Fenghe Tang et.al. | 2305.09447v2 | link |
2023-05-16 | Diffusion Dataset Generation: Towards Closing the Sim2Real Gap for Pedestrian Detection | Andrew Farley et.al. | 2305.09401v1 | null |
2023-05-17 | AMD: Autoregressive Motion Diffusion | Bo Han et.al. | 2305.09381v2 | null |
2023-05-15 | Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models | Antoni Bigata Casademunt et.al. | 2305.08854v1 | link |
2023-05-15 | Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts | Yuyang Zhao et.al. | 2305.08850v1 | null |
2023-05-15 | The role of magnetic helicity when it is absent on average | Axel Brandenburg et.al. | 2305.08769v1 | null |
2023-05-15 | Diffusion-weighted SPECIAL improves the detection of J-coupled metabolites at ultra-high magnetic field | Jessie Mosso et.al. | 2305.08708v1 | null |
2023-05-15 | A Reproducible Extraction of Training Images from Diffusion Models | Ryan Webster et.al. | 2305.08694v1 | link |
2023-05-12 | Sound waves, diffusive transport, and wall slip in nanoconfined compressible fluids | Hannes Holey et.al. | 2305.07501v1 | null |
2023-05-12 | On a Voter Model with Context-Dependent Opinion Adoption | Luca Becchetti et.al. | 2305.07377v1 | null |
2023-05-12 | Experimental optimization of lensless digital holographic microscopy with rotating diffuser-based coherent noise reduction | Piotr Arcab et.al. | 2305.07373v1 | null |
2023-05-12 | Penguin huddling: a continuum model | Samuel J. Harris et.al. | 2305.07324v1 | link |
2023-05-15 | Phosphorus-Controlled Nanoepitaxy in the Asymmetric Growth of GaAs-InP Core-Shell Bent Nanowires | Spencer McDermott et.al. | 2305.07252v2 | null |
2023-05-12 | Optimal calibration of optical tweezers with arbitrary integration time and sampling frequencies – A general framework | Laura Pérez-García et.al. | 2305.07245v1 | null |
2023-05-15 | Fully quantum algorithm for lattice Boltzmann methods with application to partial differential equations | Fatima Ezahra Chrit et.al. | 2305.07148v2 | link |
2023-05-11 | Exploiting Diffusion Prior for Real-World Image Super-Resolution | Jianyi Wang et.al. | 2305.07015v1 | link |
2023-05-11 | A method for automated regression test in scientific computing libraries: illustration with SPHinXsys | Bo Zhang et.al. | 2305.06970v1 | link |
2023-05-11 | CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model | Zhen Ye et.al. | 2305.06908v1 | link |
2023-05-11 | Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator | Jing Zhao et.al. | 2305.06710v1 | null |
2023-05-11 | Evaluating Twitter’s Algorithmic Amplification of Low-Trust Content: An Observational Study | Giulio Corsi et.al. | 2305.06125v2 | link |
2023-05-10 | Relightify: Relightable 3D Faces from a Single Image via Diffusion Models | Foivos Paraperas Papantoniou et.al. | 2305.06077v1 | null |
2023-05-10 | iEdit: Localised Text-guided Image Editing with Weak Supervision | Rumeysa Bodur et.al. | 2305.05947v1 | null |
2023-05-09 | Large Language Models Humanize Technology | Pratyush Kumar et.al. | 2305.05576v1 | null |
2023-05-09 | Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer | Nisha Huang et.al. | 2305.05464v1 | link |
2023-05-10 | Large Language Models Need Holistically Thought in Medical Conversational QA | Yixuan Weng et.al. | 2305.05410v2 | link |
2023-05-09 | The Multi-cluster Two-Wave Fading Model | Juan P. Pena-Martin et.al. | 2305.05342v1 | null |
2023-05-08 | DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models | Sicheng Yang et.al. | 2305.04919v1 | link |
2023-05-08 | CaloClouds: Fast Geometry-Independent Highly-Granular Calorimeter Simulation | Erik Buhmann et.al. | 2305.04847v1 | link |
2023-05-08 | A Drop of Ink may Make a Million Think: The Spread of False Information in Large Language Models | Ning Bian et.al. | 2305.04812v1 | null |
2023-05-08 | Controllable Light Diffusion for Portraits | David Futschik et.al. | 2305.04745v1 | null |
2023-05-08 | A Closest Point Method for Surface PDEs with Interior Boundary Conditions for Geometry Processing | Nathan King et.al. | 2305.04711v1 | null |
2023-05-08 | ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation | Yupei Lin et.al. | 2305.04651v1 | null |
2023-05-05 | Reflection of a Diffuser in a Liquid Interface | C. Silva et.al. | 2305.03682v1 | null |
2023-05-05 | Conditional Diffusion Feature Refinement for Continuous Sign Language Recognition | Leming Guo et.al. | 2305.03614v1 | null |
2023-05-05 | Data Curation for Image Captioning with Text-to-Image Generative Models | Wenyan Li et.al. | 2305.03610v1 | link |
2023-05-04 | Personalize Segment Anything Model with One Shot | Renrui Zhang et.al. | 2305.03048v1 | link |
2023-05-05 | Capacity Bounds for Vertically-Drifted First Arrival Position Channels under a Second-Moment Constraint | Yun-Feng Lo et.al. | 2305.02706v2 | null |
2023-05-03 | Nonlocal gravity wave turbulence in presence of condensate | A. O. Korotkevich et.al. | 2305.01930v1 | null |
2023-05-04 | DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion | Kiyohiro Nakayama et.al. | 2305.01921v2 | null |
2023-05-04 | The Impacts of Dimensionality, Diffusion, and Directedness on Intrinsic Cross-Model Simulation in Tile-Based Self-Assembly | Daniel Hader et.al. | 2305.01877v2 | null |
2023-05-03 | Multimodal Data Augmentation for Image Captioning using Diffusion Models | Changrong Xiao et.al. | 2305.01855v1 | link |
2023-05-02 | Unpaired Downscaling of Fluid Flows with Diffusion Bridges | Tobias Bischoff et.al. | 2305.01822v1 | link |
2023-05-02 | Multimodal Procedural Planning via Dual Text-Image Prompting | Yujie Lu et.al. | 2305.01795v1 | link |
2023-05-02 | DiffuSum: Generation Enhanced Extractive Summarization with Diffusion | Haopeng Zhang et.al. | 2305.01735v1 | link |
2023-05-02 | ContactArt: Learning 3D Interaction Priors for Category-level Articulated Object and Hand Poses Estimation | Zehao Zhu et.al. | 2305.01618v1 | null |
2023-05-02 | Adopting AI: How Familiarity Breeds Both Trust and Contempt | Michael C. Horowitz et.al. | 2305.01405v1 | null |
2023-05-02 | Long-Term Rhythmic Video Soundtracker | Jiashuo Yu et.al. | 2305.01319v1 | link |
2023-05-02 | DreamPaint: Few-Shot Inpainting of E-Commerce Items for Virtual Try-On without 3D Modeling | Mehmet Saygin Seyfioglu et.al. | 2305.01257v1 | null |
2023-05-02 | Solving Inverse Problems with Score-Based Generative Priors learned from Noisy Data | Asad Aali et.al. | 2305.01166v1 | null |
2023-05-02 | Geometric Latent Diffusion Models for 3D Molecule Generation | Minkai Xu et.al. | 2305.01140v1 | link |
2023-05-01 | Fractional and tempered fractional models for Reynolds-averaged Navier-Stokes equations | Pavan Pranjivan Mehta et.al. | 2305.00770v1 | null |
2023-05-01 | Diffusion Models for Time Series Applications: A Survey | Lequan Lin et.al. | 2305.00624v1 | null |
2023-04-30 | Class-Balancing Diffusion Models | Yiming Qin et.al. | 2305.00562v1 | link |
2023-04-30 | Towards Computational Architecture of Liberty: A Comprehensive Survey on Deep Learning for Generating Virtual Architecture in the Metaverse | Anqi Wang et.al. | 2305.00510v1 | null |
2023-04-28 | Scaling regimes in rapidly rotating thermal convection at extreme Rayleigh numbers | Jiaxing Song et.al. | 2304.14854v1 | null |
2023-04-28 | Simplified models of diffusion in radially-symmetric geometries | Luke P. Filippini et.al. | 2304.14632v1 | link |
2023-04-28 | MUDiff: Unified Diffusion for Complete Molecule Generation | Chenqing Hua et.al. | 2304.14621v1 | null |
2023-04-28 | Robust Gaussian Process Regression method for efficient reaction pathway optimization: application to surface processes | Wei Fang et.al. | 2304.14596v1 | null |
2023-04-28 | SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis | Azade Farshad et.al. | 2304.14573v1 | null |
2023-04-27 | It is all about where you start: Text-to-image generation with seed selection | Dvir Samuel et.al. | 2304.14530v1 | link |
2023-04-27 | Putting People in Their Place: Affordance-Aware Human Insertion into Scenes | Sumith Kulal et.al. | 2304.14406v1 | link |
2023-04-27 | Motion-Conditioned Diffusion Model for Controllable Video Synthesis | Tsai-Shien Chen et.al. | 2304.14404v1 | null |
2023-04-27 | Maximizing Model Generalization for Manufacturing with Self-Supervised Learning and Federated Learning | Matthew Russell et.al. | 2304.14398v1 | null |
2023-04-27 | Functional Diffusion Maps | María Barroso et.al. | 2304.14378v1 | link |
2023-04-27 | LDPC Decoders Prefer More Reliable Parity Bits: Unequal Data Protection Over BSC | Beyza Dabak et.al. | 2304.14278v1 | null |
2023-04-27 | DataComp: In search of the next generation of multimodal datasets | Samir Yitzhak Gadre et.al. | 2304.14108v1 | link |
2023-04-26 | Heuristic Barycenter Modeling of Fully Absorbing Receivers in Diffusive Molecular Communication Channels | Fardad Vakilipoor et.al. | 2304.13640v1 | null |
2023-04-26 | Identifying the structure patterns to govern the performance of localization in regulating innovation diffusion | Leyang Xue et.al. | 2304.13608v1 | null |
2023-04-26 | Bifractality of fractal scale-free networks | Jun Yamamoto et.al. | 2304.13438v1 | null |
2023-04-26 | Training-Free Location-Aware Text-to-Image Synthesis | Jiafeng Mao et.al. | 2304.13427v1 | null |
2023-04-25 | The Score-Difference Flow for Implicit Generative Modeling | Romann M. Weber et.al. | 2304.12906v1 | null |
2023-04-25 | Latent diffusion models for generative precipitation nowcasting with accurate uncertainty quantification | Jussi Leinonen et.al. | 2304.12891v1 | link |
2023-04-25 | Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning | Cheng Lu et.al. | 2304.12824v1 | link |
2023-04-25 | A Binary Annular Phase Mask to Regulate Spherical Aberration and Allow Super-Localization in Single-Particle Tracking over Extended Depth-of-Focus | Quentin Gresil et.al. | 2304.12774v1 | null |
2023-04-25 | Effect of trap states, ion migration and interfaces on carrier transport in single crystal, polycrystalline and thick film devices of halide perovskites CH$_3$NH$_3$PbX$_3$ (X = I, Br, Cl) | Mohd Warish et.al. | 2304.12701v1 | null |
2023-04-24 | Analyzing the neutron and $\gamma$-ray emission properties of an americium-beryllium tagged neutron source | Hiroshi Ito et.al. | 2304.12153v1 | null |
2023-04-24 | Efficient Halftoning via Deep Reinforcement Learning | Haitian Jiang et.al. | 2304.12152v1 | null |
2023-04-24 | Variational Diffusion Auto-encoder: Deep Latent Variable Model with Unconditional Diffusion Prior | Georgios Batzolis et.al. | 2304.12141v1 | null |
2023-04-24 | Customized Load Profiles Synthesis for Electricity Customers Based on Conditional Diffusion Models | Zhenyi Wang et.al. | 2304.12076v1 | null |
2023-04-24 | Improving Synthetically Generated Image Detection in Cross-Concept Settings | Pantelis Dogoulis et.al. | 2304.12053v1 | link |
2023-04-21 | BoDiffusion: Diffusing Sparse Observations for Full-Body Human Motion Synthesis | Angela Castillo et.al. | 2304.11118v1 | null |
2023-04-21 | Improved Diffusion-based Image Colorization via Piggybacked Models | Hanyuan Liu et.al. | 2304.11105v1 | null |
2023-04-21 | Perturbatively corrected ring-polymer instanton theory for accurate tunneling splittings | Joseph E. Lawrence et.al. | 2304.10963v1 | null |
2023-04-20 | Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion | Tomas Jakab et.al. | 2304.10535v1 | null |
2023-04-20 | Nerfbusters: Removing Ghostly Artifacts from Casually Captured NeRFs | Frederik Warburg et.al. | 2304.10532v1 | link |
2023-04-20 | Collaborative Diffusion for Multi-Modal Face Generation and Editing | Ziqi Huang et.al. | 2304.10530v1 | link |
2023-04-20 | Prediction of the evolution of the nuclear reactor core parameters using artificial neural network | Krzysztof Palmi et.al. | 2304.10337v1 | null |
2023-04-20 | Avoiding methane emission rate underestimates when using the divergence method | Clayton Roberts et.al. | 2304.10303v1 | null |
2023-04-20 | Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis | Yankun Wu et.al. | 2304.10278v1 | link |
2023-04-19 | Irregular dependence on Stokes number and non-ergodic transport of heavy inertial particles in steady laminar flows | Anu V. S. Nath et.al. | 2304.09804v1 | null |
2023-04-19 | NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models | Seung Wook Kim et.al. | 2304.09787v1 | null |
2023-04-19 | Signatures of heterogeneity in the statistical structure of target state aligned ensembles | Nicolas Lenner et.al. | 2304.09719v1 | null |
2023-04-18 | Monte-Carlo method for incompressible fluid flows past obstacles | Vladislav Cherepanov et.al. | 2304.09152v1 | null |
2023-04-18 | On the seed population of solar energetic particles in the inner heliosphere | Nicolas Wijsen et.al. | 2304.09098v1 | null |
2023-04-18 | Construction of coarse-grained molecular dynamics with many-body non-Markovian memory | Liyao Lyu et.al. | 2304.09044v1 | null |
2023-04-18 | Look ATME: The Discriminator Mean Entropy Needs Attention | Edgardo Solano-Carrillo et.al. | 2304.09024v1 | link |
2023-04-18 | UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer | Soon Yau Cheong et.al. | 2304.08870v1 | link |
2023-04-17 | Text2Performer: Text-Driven Human Video Generation | Yuming Jiang et.al. | 2304.08483v1 | link |
2023-04-18 | Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation | Jie An et.al. | 2304.08477v2 | null |
2023-04-17 | Synthetic Data from Diffusion Models Improves ImageNet Classification | Shekoofeh Azizi et.al. | 2304.08466v1 | null |
2023-04-17 | MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing | Mingdeng Cao et.al. | 2304.08465v1 | link |
2023-04-17 | OVTrack: Open-Vocabulary Multiple Object Tracking | Siyuan Li et.al. | 2304.08408v1 | null |
2023-04-17 | Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models | Ziwei Luo et.al. | 2304.08291v1 | link |
2023-04-17 | Solving stiff ordinary differential equations using physics informed neural networks (PINNs): simple recipes to improve training of vanilla-PINNs | Hubert Baty et.al. | 2304.08289v1 | link |
2023-04-14 | A Comparative Study on Generative Models for High Resolution Solar Observation Imaging | Mehdi Cherti et.al. | 2304.07169v1 | link |
2023-04-14 | Towards Controllable Diffusion Models via Reward-Guided Exploration | Hengtong Zhang et.al. | 2304.07132v1 | null |
2023-04-14 | Delta Denoising Score | Amir Hertz et.al. | 2304.07090v1 | null |
2023-04-14 | Memory Efficient Diffusion Probabilistic Models via Patch-based Generation | Shinei Arakawa et.al. | 2304.07087v1 | null |
2023-04-14 | DCFace: Synthetic Face Generation with Dual Condition Diffusion Model | Minchul Kim et.al. | 2304.07060v1 | link |
2023-04-14 | A Diffusion model for POI recommendation | Yifang Qin et.al. | 2304.07041v1 | link |
2023-04-13 | Expressive Text-to-Image Generation with Rich Text | Songwei Ge et.al. | 2304.06720v1 | null |
2023-04-13 | Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction | Hansheng Chen et.al. | 2304.06714v1 | link |
2023-04-13 | DiffusionRig: Learning Personalized Priors for Facial Appearance Editing | Zheng Ding et.al. | 2304.06711v1 | link |
2023-04-13 | Learning Controllable 3D Diffusion Models from Single-view Images | Jiatao Gu et.al. | 2304.06700v1 | null |
2023-04-13 | DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning | Enze Xie et.al. | 2304.06648v1 | null |
2023-04-12 | Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA | James Seale Smith et.al. | 2304.06027v1 | null |
2023-04-12 | DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion | Johanna Karras et.al. | 2304.06025v1 | null |
2023-04-12 | Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views | Siwei Zhang et.al. | 2304.06024v1 | link |
2023-04-12 | SpectralDiff: Hyperspectral Image Classification with Spectral-Spatial Diffusion Models | Ning Chen et.al. | 2304.05961v1 | link |
2023-04-12 | Diffusion models with location-scale noise | Alexia Jolicoeur-Martineau et.al. | 2304.05907v1 | null |
2023-04-12 | Cancer-Net BCa-S: Breast Cancer Grade Prediction using Volumetric Deep Radiomic Features from Synthetic Correlated Diffusion Imaging | Chi-en Amy Tai et.al. | 2304.05899v1 | link |
2023-04-11 | HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models | Eslam Mohamed Bakr et.al. | 2304.05390v1 | link |
2023-04-11 | Diffusion Models for Constrained Domains | Nic Fishman et.al. | 2304.05364v1 | link |
2023-04-11 | Multi-scale Fusion Fault Diagnosis Method Based on Two-Dimensionaliztion Sequence in Complex Scenarios | Weiyang Jin et.al. | 2304.05198v1 | null |
2023-04-10 | A Cheaper and Better Diffusion Language Model with Soft-Masked Noise | Jiaao Chen et.al. | 2304.04746v1 | link |
2023-04-10 | Ambiguous Medical Image Segmentation using Diffusion Models | Aimon Rahman et.al. | 2304.04745v1 | link |
2023-04-10 | Sequential Recommendation with Diffusion Models | Hanwen Du et.al. | 2304.04541v1 | null |
2023-04-07 | Compressed Regression over Adaptive Networks | Marco Carpentiero et.al. | 2304.03638v1 | null |
2023-04-07 | Exploring Collaborative Distributed Diffusion-Based AI-Generated Content (AIGC) in Wireless Networks | Hongyang Du et.al. | 2304.03446v1 | link |
2023-04-06 | RoSteALS: Robust Steganography using Autoencoder Latent Space | Tu Bui et.al. | 2304.03400v1 | link |
2023-04-06 | Diffusion Models as Masked Autoencoders | Chen Wei et.al. | 2304.03283v1 | null |
2023-04-06 | Inst-Inpaint: Instructing to Remove Objects with Diffusion Models | Ahmet Burak Yildirim et.al. | 2304.03246v1 | link |
2023-04-06 | Face Animation with an Attribute-Guided Diffusion Model | Bohan Zeng et.al. | 2304.03199v1 | link |
2023-04-06 | SketchFFusion: Sketch-guided image editing with diffusion model | Weihang Mao et.al. | 2304.03174v1 | null |
2023-04-05 | Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models | Xuhui Jia et.al. | 2304.02642v1 | null |
2023-04-05 | GenPhys: From Physical Processes to Generative Models | Ziming Liu et.al. | 2304.02637v1 | null |
2023-04-05 | An atlas of the heterogeneous viscoelastic brain with local power-law attenuation synthesised using Prony-series | Oisin Morrison et.al. | 2304.02610v1 | null |
2023-04-05 | Generative Novel View Synthesis with 3D-Aware Diffusion Models | Eric R. Chan et.al. | 2304.02602v1 | null |
2023-04-05 | Diffusion across a concentration step: Strongly nonmonotonic evolution into thermodynamic equilibrium | Hans R. Moser et.al. | 2304.02557v1 | null |
2023-04-04 | viz2viz: Prompt-driven stylized visualization generation using a diffusion model | Jiaqi Wu et.al. | 2304.01919v1 | null |
2023-04-04 | PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion | Gwanghyun Kim et.al. | 2304.01900v1 | null |
2023-04-04 | Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion | Davis Rempe et.al. | 2304.01893v1 | null |
2023-04-04 | Quantitative perfusion and water transport time model from multi b-value diffusion magnetic resonance imaging validated against neutron capture microspheres | M. Liu et.al. | 2304.01888v1 | null |
2023-04-04 | Adaptive learning of effective dynamics: Adaptive real-time, online modeling for complex systems | Ivica Kičić et.al. | 2304.01732v1 | link |
2023-04-03 | Learning to Read Braille: Bridging the Tactile Reality Gap with Diffusion Models | Carolina Higuera et.al. | 2304.01182v1 | link |
2023-04-03 | ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model | Mingyuan Zhang et.al. | 2304.01116v1 | link |
2023-04-03 | ViT-DAE: Transformer-driven Diffusion Autoencoder for Histopathology Image Analysis | Xuan Xu et.al. | 2304.01053v1 | null |
2023-04-03 | DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models | Yukang Cao et.al. | 2304.00916v1 | link |
2023-03-31 | $\infty$-Diff: Infinite Resolution Diffusion with Subsampled Mollified States | Sam Bond-Taylor et.al. | 2303.18242v1 | link |
2023-03-31 | A Closer Look at Parameter-Efficient Tuning in Diffusion Models | Chendong Xiang et.al. | 2303.18181v1 | link |
2023-03-31 | One-shot Unsupervised Domain Adaptation with Personalized Diffusion Models | Yasser Benigmim et.al. | 2303.18080v1 | link |
2023-03-30 | AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control | Ruixiang Jiang et.al. | 2303.17606v1 | link |
2023-03-30 | Token Merging for Fast Stable Diffusion | Daniel Bolya et.al. | 2303.17604v1 | link |
2023-03-30 | Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models | Wen Wang et.al. | 2303.17599v1 | link |
2023-03-30 | Consistent View Synthesis with Pose-Guided Diffusion Models | Hung-Yu Tseng et.al. | 2303.17598v1 | null |
2023-03-30 | Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models | Eric Zhang et.al. | 2303.17591v1 | link |
2023-03-30 | DDP: Diffusion Model for Dense Visual Prediction | Yuanfeng Ji et.al. | 2303.17559v1 | link |
2023-03-30 | DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder | Chenpeng Du et.al. | 2303.17550v1 | null |
2023-03-30 | PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models | Vidit Goel et.al. | 2303.17546v1 | link |
2023-03-29 | Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos | Kun Su et.al. | 2303.16897v1 | null |
2023-03-30 | MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path | Qian Wang et.al. | 2303.16765v2 | link |
2023-03-29 | 4D Facial Expression Diffusion Model | Kaifeng Zou et.al. | 2303.16611v1 | link |
2023-03-29 | WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models | Konstantina Nikolaidou et.al. | 2303.16576v1 | link |
2023-03-29 | Your Diffusion Model is Secretly a Zero-Shot Classifier | Alexander C. Li et.al. | 2303.16203v2 | link |
2023-03-28 | Visual Chain-of-Thought Diffusion Models | William Harvey et.al. | 2303.16187v1 | link |
2023-03-28 | Diffusion Maps for Group-Invariant Manifolds | Paulina Hoyos et.al. | 2303.16169v1 | null |
2023-03-28 | Novel View Synthesis of Humans using Differentiable Rendering | Guillaume Rochette et.al. | 2303.15880v1 | link |
2023-03-27 | The Stable Signature: Rooting Watermarks in Latent Diffusion Models | Pierre Fernandez et.al. | 2303.15435v1 | link |
2023-03-27 | Anti-DreamBooth: Protecting users from personalized text-to-image synthesis | Thanh Van Le et.al. | 2303.15433v1 | link |
2023-03-27 | Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation | Susung Hong et.al. | 2303.15413v1 | link |
2023-03-27 | Training-free Style Transfer Emerges from h-space in Diffusion models | Jaeseok Jeong et.al. | 2303.15403v1 | null |
2023-03-27 | Exploring Continual Learning of Diffusion Models | Michał Zając et.al. | 2303.15342v1 | null |
2023-03-27 | Diffusion Models for Memory-efficient Processing of 3D Medical Images | Florentin Bieder et.al. | 2303.15288v1 | link |
2023-03-27 | Text-to-Image Diffusion Models are Zero-Shot Classifiers | Kevin Clark et.al. | 2303.15233v1 | null |
2023-03-24 | Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior | Junshu Tang et.al. | 2303.14184v1 | link |
2023-03-24 | MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion | Yizhuo Lu et.al. | 2303.14139v1 | null |
2023-03-24 | CIFAKE: Image Classification and Explainable Identification of AI-Generated Synthetic Images | Jordan J. Bird et.al. | 2303.14126v1 | null |
2023-03-24 | Electron transport measurements in liquid xenon with Xenoscope, a large-scale DARWIN demonstrator | L. Baudis et.al. | 2303.13963v1 | null |
2023-03-23 | Ablating Concepts in Text-to-Image Diffusion Models | Nupur Kumari et.al. | 2303.13516v1 | link |
2023-03-23 | ReVersion: Diffusion-Based Relation Inversion from Images | Ziqi Huang et.al. | 2303.13495v1 | link |
2023-03-23 | Scaling laws of two-dimensional incompressible turbulent transport | D. I. Palade et.al. | 2303.13457v1 | null |
2023-03-23 | Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators | Levon Khachatryan et.al. | 2303.13439v1 | link |
2023-03-23 | Medical diffusion on a budget: textual inversion for medical image generation | Bram de Wilde et.al. | 2303.13430v1 | null |
2023-03-23 | DDT: A Diffusion-Driven Transformer-based Framework for Human Mesh Recovery from a Video | Ce Zheng et.al. | 2303.13397v1 | null |
2023-03-23 | Audio Diffusion Model for Speech Synthesis: A Survey on Text To Speech and Speech Enhancement in Generative AI | Chenshuang Zhang et.al. | 2303.13336v1 | null |
2023-03-23 | Decentralized Adversarial Training over Graphs | Ying Cao et.al. | 2303.13326v1 | null |
2023-03-23 | Fourier Diffusion Models: A Method to Control MTF and NPS in Score-Based Stochastic Image Generation | Matthew Tivnan et.al. | 2303.13285v1 | null |
2023-03-22 | Diffuse-Denoise-Count: Accurate Crowd-Counting with Diffusion Models | Yasiru Ranasinghe et.al. | 2303.12790v1 | link |
2023-03-22 | Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions | Ayaan Haque et.al. | 2303.12789v1 | null |
2023-03-22 | FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models | Jianglong Ye et.al. | 2303.12786v1 | null |
2023-03-22 | Effect of gamma radiation on electrical properties of diffusive memristor devices | D. P. Pattnaik et.al. | 2303.12762v1 | null |
2023-03-22 | Pix2Video: Video Editing using Image Diffusion | Duygu Ceylan et.al. | 2303.12688v1 | link |
2023-03-23 | Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis | Hadrien Reynaud et.al. | 2303.12644v2 | link |
2023-03-22 | A Perceptual Quality Assessment Exploration for AIGC Images | Zicheng Zhang et.al. | 2303.12618v1 | null |
2023-03-21 | Vox-E: Text-guided Voxel Editing of 3D Objects | Etai Sella et.al. | 2303.12048v1 | link |
2023-03-21 | Semantic Latent Space Regression of Diffusion Autoencoders for Vertebral Fracture Grading | Matthias Keicher et.al. | 2303.12031v1 | null |
2023-03-21 | Numerical simulation of self-oscillating catalytic reaction in a plug-flow reactor | N. V. Peskov et.al. | 2303.12022v1 | null |
2023-03-21 | 3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion | Yu-Jhe Li et.al. | 2303.11938v1 | null |
2023-03-21 | CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | Geonmo Gu et.al. | 2303.11916v1 | link |
2023-03-21 | Projections of Model Spaces for Latent Graph Inference | Haitz Sáez de Ocáriz Borde et.al. | 2303.11754v1 | null |
2023-03-20 | Zero-1-to-3: Zero-shot One Image to 3D Object | Ruoshi Liu et.al. | 2303.11328v1 | link |
2023-03-20 | Localizing Object-level Shape Variations with Text-to-Image Diffusion Models | Or Patashnik et.al. | 2303.11306v1 | null |
2023-03-20 | SVDiff: Compact Parameter Space for Diffusion Fine-Tuning | Ligong Han et.al. | 2303.11305v1 | link |
2023-03-20 | AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion Models | Yu Cao et.al. | 2303.11137v1 | link |
2023-03-17 | A Recipe for Watermarking Diffusion Models | Yunqing Zhao et.al. | 2303.10137v1 | link |
2023-03-17 | Data-Centric Learning from Unlabeled Graphs with Diffusion Model | Gang Liu et.al. | 2303.10108v1 | link |
2023-03-17 | DialogPaint: A Dialog-based Image Editing Model | Jingxuan Wei et.al. | 2303.10073v1 | null |
2023-03-17 | GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation | Can Qin et.al. | 2303.10056v1 | link |
2023-03-17 | On the momentum diffusion over multiphase surfaces with meshless methods | Johannes C. Joubert et.al. | 2303.09978v1 | null |
2023-03-17 | Adversarial Counterfactual Visual Explanations | Guillaume Jeanneret et.al. | 2303.09962v1 | link |
2023-03-17 | Discovering mesoscopic descriptions of collective movement with neural stochastic modelling | Utkarsh Pratiush et.al. | 2303.09906v1 | link |
2023-03-16 | Efficient Diffusion Training via Min-SNR Weighting Strategy | Tiankai Hang et.al. | 2303.09556v1 | link |
2023-03-16 | Diffusion-HPC: Generating Synthetic Images with Realistic Humans | Zhenzhen Weng et.al. | 2303.09541v1 | link |
2023-03-17 | FateZero: Fusing Attentions for Zero-shot Text-based Video Editing | Chenyang Qi et.al. | 2303.09535v2 | link |
2023-03-16 | $P+$: Extended Textual Conditioning in Text-to-Image Generation | Andrey Voynov et.al. | 2303.09522v1 | null |
2023-03-16 | DiffIR: Efficient Diffusion Model for Image Restoration | Bin Xia et.al. | 2303.09472v1 | link |
2023-03-16 | Unwrapping NPT simulations to calculate diffusion coefficients | Jakob Tómas Bullerjahn et.al. | 2303.09418v1 | null |
2023-03-17 | DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars | David Svitov et.al. | 2303.09375v2 | link |
2023-03-15 | Stochastic Interpolants: A Unifying Framework for Flows and Diffusions | Michael S. Albergo et.al. | 2303.08797v1 | null |
2023-03-15 | Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion | Inhwa Han et.al. | 2303.08767v1 | null |
2023-03-15 | Advanced Analysis of Radar Cross-Section Measurements in Reverberation Environment | Corentin Charlo et.al. | 2303.08751v1 | null |
2023-03-15 | DiffusionAD: Denoising Diffusion for Anomaly Detection | Hui Zhang et.al. | 2303.08730v1 | link |
2023-03-16 | ResDiff: Combining CNN and Diffusion Model for Image Super-Resolution | Shuyao Shang et.al. | 2303.08714v2 | null |
2023-03-15 | Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer | Serin Yang et.al. | 2303.08622v1 | link |
2023-03-14 | LayoutDM: Discrete Diffusion Model for Controllable Layout Generation | Naoto Inoue et.al. | 2303.08137v1 | link |
2023-03-14 | MeshDiffusion: Score-based Generative 3D Mesh Modeling | Zhen Liu et.al. | 2303.08133v1 | link |
2023-03-14 | Editing Implicit Assumptions in Text-to-Image Diffusion Models | Hadas Orgad et.al. | 2303.08084v1 | link |
2023-03-15 | Interpretable ODE-style Generative Diffusion Model via Force Field Construction | Weiyang Jin et.al. | 2303.08063v2 | null |
2023-03-14 | Edit-A-Video: Single Video Editing with Object-Aware Consistency | Chaehun Shin et.al. | 2303.07945v1 | null |
2023-03-15 | Controllable Mesh Generation Through Sparse Latent Point Diffusion Models | Zhaoyang Lyu et.al. | 2303.07938v2 | null |
2023-03-15 | Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation | Junyoung Seo et.al. | 2303.07937v2 | link |
2023-03-13 | Erasing Concepts from Diffusion Models | Rohit Gandikota et.al. | 2303.07345v1 | link |
2023-03-14 | Parallel Vertex Diffusion for Unified Visual Grounding | Zesen Cheng et.al. | 2303.07216v2 | null |
2023-03-10 | GECCO: Geometrically-Conditioned Point Diffusion Models | Michał J. Tyszkiewicz et.al. | 2303.05916v1 | null |
2023-03-10 | Photon Diffusion in Microscale Solids | Avijit Das et.al. | 2303.05776v1 | null |
2023-03-10 | TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets | Weixin Chen et.al. | 2303.05762v1 | link |
2023-03-10 | Fast Diffusion Sampler for Inverse Problems by Geometric Decomposition | Hyungjin Chung et.al. | 2303.05754v1 | link |
2023-03-09 | Scaling up GANs for Text-to-Image Synthesis | Minguk Kang et.al. | 2303.05511v1 | null |
2023-03-09 | Resolving quantitative MRI model degeneracy with machine learning via training data distribution design | Michele Guerreri et.al. | 2303.05464v1 | null |
2023-03-09 | 3DGen: Triplane Latent Diffusion for Textured Mesh Generation | Anchit Gupta et.al. | 2303.05371v1 | null |
2023-03-09 | TGDataset: a Collection of Over One Hundred Thousand Telegram Channels | Massimo La Morgia et.al. | 2303.05345v1 | link |
2023-03-09 | Brain-Diffuser: Natural scene reconstruction from fMRI signals using generative latent diffusion | Furkan Ozcelik et.al. | 2303.05334v1 | link |
2023-03-08 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | Jiarui Xu et.al. | 2303.04803v1 | link |
2023-03-08 | Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation | Paul Hagemann et.al. | 2303.04772v1 | link |
2023-03-08 | Video-P2P: Video Editing with Cross-attention Control | Shaoteng Liu et.al. | 2303.04761v1 | null |
2023-03-08 | Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models | Chenfei Wu et.al. | 2303.04671v1 | link |
2023-03-08 | Diffusing Gaussian Mixtures for Generating Categorical Data | Florence Regol et.al. | 2303.04635v1 | link |
2023-03-08 | Connecting finite-time Lyapunov exponents with supersaturation and droplet dynamics in the bulk of a turbulent cloud | Vladyslav Pushenko et.al. | 2303.04632v1 | null |
2023-03-08 | Maritime transportation and people mobility in the early diffusion of COVID-19 in Croatia | Corentin Cot et.al. | 2303.04617v1 | null |
2023-03-07 | Diffusion Policy: Visuomotor Policy Learning via Action Diffusion | Cheng Chi et.al. | 2303.04137v1 | null |
2023-03-06 | Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers | Sitan Chen et.al. | 2303.03384v1 | null |
2023-03-06 | StyO: Stylize Your Face in Only One-Shot | Bonan Li et.al. | 2303.03231v1 | null |
2023-03-03 | Unleashing Text-to-Image Diffusion Models for Visual Perception | Wenliang Zhao et.al. | 2303.02153v1 | link |
2023-03-03 | Multi-Agent Adversarial Training Using Diffusion Learning | Ying Cao et.al. | 2303.01936v1 | null |
2023-03-03 | CONTAIN: A Community-based Algorithm for Network Immunization | Özgur Coban et.al. | 2303.01934v1 | link |
2023-03-02 | Consistency Models | Yang Song et.al. | 2303.01469v1 | link |
2023-03-02 | Human Motion Diffusion as a Generative Prior | Yonatan Shafir et.al. | 2303.01418v1 | link |
2023-03-02 | Why (and When) does Local SGD Generalize Better than SGD? | Xinran Gu et.al. | 2303.01215v1 | link |
2023-03-01 | StraIT: Non-autoregressive Generation with Stratified Image Transformer | Shengju Qian et.al. | 2303.00750v1 | null |
2023-03-01 | Diffusing Graph Attention | Daniel Glickman et.al. | 2303.00613v1 | null |
2023-03-01 | Level Up the Deepfake Detection: a Method to Effectively Discriminate Images Generated by GAN Architectures and Diffusion Models | Luca Guarnera et.al. | 2303.00608v1 | null |
2023-03-01 | Unlimited-Size Diffusion Restoration | Yinhuai Wang et.al. | 2303.00354v1 | link |
2023-03-01 | Collage Diffusion | Vishnu Sarukkai et.al. | 2303.00262v1 | null |
2023-03-01 | Diffusion Probabilistic Fields | Peiye Zhuang et.al. | 2303.00165v1 | null |
2023-02-28 | Phase Field Modeling of Dictyostelium Discoideum Chemotaxis | Yunsong Zhang et.al. | 2302.14854v1 | null |
2023-02-28 | Monocular Depth Estimation using Diffusion Models | Saurabh Saxena et.al. | 2302.14816v1 | null |
2023-02-28 | Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection | Jian Shi et.al. | 2302.14696v1 | link |
2023-02-28 | Synthesizing Mixed-type Electronic Health Records using Diffusion Models | Taha Ceritli et.al. | 2302.14679v1 | null |
2023-02-28 | Detecting and Optimising Team Interactions in Software Development | Christian Zingg et.al. | 2302.14609v1 | null |
2023-02-28 | Can We Use Diffusion Probabilistic Models for 3D Motion Prediction? | Hyemin Ahn et.al. | 2302.14503v1 | null |
2023-02-27 | Buoyancy-driven attraction of active droplets | Yibo Chen et.al. | 2302.14008v1 | null |
2023-02-27 | Impact of reconstruction schemes on interpreting lattice Boltzmann results – A study using the Taylor-Green vortex problem | Jianping Meng et.al. | 2302.13910v1 | null |
2023-02-27 | Differentially Private Diffusion Models Generate Useful Synthetic Images | Sahra Ghalebikesabi et.al. | 2302.13861v1 | null |
2023-02-27 | Denoising Diffusion Samplers | Francisco Vargas et.al. | 2302.13834v1 | null |
2023-02-24 | Modulating Pretrained Diffusion Models for Multimodal Image Synthesis | Cusuh Ham et.al. | 2302.12764v1 | null |
2023-02-24 | Physical interactions promote Turing patterns | Lucas Menou et.al. | 2302.12521v1 | null |
2023-02-24 | Flow instability and momentum exchange in separation control by a synthetic jet | Yoshiaki Abe et.al. | 2302.12496v1 | null |
2023-02-24 | Unsupervised Discovery of Semantic Latent Directions in Diffusion Models | Yong-Hyun Park et.al. | 2302.12469v1 | null |
2023-02-23 | To the Noise and Back: Diffusion for Shared Autonomy | Takuma Yoneda et.al. | 2302.12244v1 | null |
2023-02-23 | DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models | Jamie Wynn et.al. | 2302.12231v1 | link |
2023-02-23 | Designing an Encoder for Fast Personalization of Text-to-Image Models | Rinon Gal et.al. | 2302.12228v1 | null |
2023-02-23 | Metric-oriented Speech Enhancement using Diffusion Probabilistic Model | Chen Chen et.al. | 2302.11989v1 | null |
2023-02-22 | Uncovering Bias in Face Generation Models | Cristian Muñoz et.al. | 2302.11562v1 | null |
2023-02-22 | Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC | Yilun Du et.al. | 2302.11552v1 | link |
2023-02-22 | Scaling Robot Learning with Semantically Imagined Experience | Tianhe Yu et.al. | 2302.11550v1 | null |
2023-02-22 | Aligned Diffusion Schrödinger Bridges | Vignesh Ram Somnath et.al. | 2302.11419v1 | link |
2023-02-22 | Entity-Level Text-Guided Image Manipulation | Yikai Wang et.al. | 2302.11383v1 | link |
2023-02-22 | An agent-based model of the 2020 international policy diffusion in response to the COVID-19 pandemic with particle filter | Yannick Oswald et.al. | 2302.11277v1 | link |
2023-02-21 | Provable Copyright Protection for Generative Models | Nikhil Vyas et.al. | 2302.10870v1 | null |
2023-02-21 | Learning 3D Photography Videos via Self-supervised Diffusion on Single Images | Xiaodong Wang et.al. | 2302.10781v1 | null |
2023-02-21 | On Calibrating Diffusion Probabilistic Models | Tianyu Pang et.al. | 2302.10688v1 | link |
2023-02-21 | $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction | Luke Melas-Kyriazi et.al. | 2302.10668v1 | link |
2023-02-21 | RealFusion: 360° Reconstruction of Any Object from a Single Image | Luke Melas-Kyriazi et.al. | 2302.10663v1 | null |
2023-02-21 | Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels | Zebin You et.al. | 2302.10586v1 | link |
2023-02-20 | Towards Universal Fake Image Detectors that Generalize Across Generative Models | Utkarsh Ojha et.al. | 2302.10174v1 | link |
2023-02-20 | Cross-domain Compositing with Pretrained Diffusion Models | Roy Hachnochi et.al. | 2302.10167v1 | link |
2023-02-20 | NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion | Jiatao Gu et.al. | 2302.10109v1 | null |
2023-02-20 | DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises | Jiasheng Ye et.al. | 2302.10025v1 | link |
2023-02-17 | Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent | Giannis Daras et.al. | 2302.09057v1 | link |
2023-02-17 | MiDi: Mixed Graph and 3D Denoising Diffusion for Molecule Generation | Clement Vignac et.al. | 2302.09048v1 | link |
2023-02-17 | LDFA: Latent Diffusion Face Anonymization for Self-driving Applications | Marvin Klemp et.al. | 2302.08931v1 | null |
2023-02-17 | Multi-unit Auction over a Social Network | Yuan Fang et.al. | 2302.08924v1 | null |
2023-02-17 | Unraveling the Variations of the Society of England and Wales through Diffusion Maps Analysis on Census 2011 | Gezhi Xiu et.al. | 2302.08701v1 | null |
2023-02-16 | Text-driven Visual Synthesis with Latent Diffusion Prior | Ting-Hsuan Liao et.al. | 2302.08510v1 | null |
2023-02-16 | T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models | Chong Mou et.al. | 2302.08453v1 | link |
2023-02-16 | Explicit Diffusion of Gaussian Mixture Model Based Image Priors | Martin Zach et.al. | 2302.08411v1 | null |
2023-02-16 | Boundary Guided Mixing Trajectory for Semantic Control with Diffusion Models | Ye Zhu et.al. | 2302.08357v1 | link |
2023-02-15 | Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation | Joshua Vendrow et.al. | 2302.07865v1 | link |
2023-02-15 | Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild | Hshmat Sahak et.al. | 2302.07864v1 | null |
2023-02-15 | Data Forensics in Diffusion Models: A Systematic Analysis of Membership Privacy | Derui Zhu et.al. | 2302.07801v1 | null |
2023-02-15 | Video Probabilistic Diffusion Models in Projected Latent Space | Sihyun Yu et.al. | 2302.07685v1 | null |
2023-02-14 | Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions | Raghav Singhal et.al. | 2302.07261v1 | null |
2023-02-14 | Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data | Minshuo Chen et.al. | 2302.07194v1 | null |
2023-02-14 | Universal Guidance for Diffusion Models | Arpit Bansal et.al. | 2302.07121v1 | link |
2023-02-14 | Differential privacy diffusion auction of homogeneous items | Fengjuan Jia et.al. | 2302.07072v1 | null |
2023-02-14 | Direct numerical simulations of the Taylor-Green Vortex interacting with a hydrogen diffusion flame: Reynolds number and non-unity Lewis number effects | Yifan Xu et.al. | 2302.07006v1 | null |
2023-02-13 | Raising the Cost of Malicious AI-Powered Image Editing | Hadi Salman et.al. | 2302.06588v1 | link |
2023-02-13 | Preconditioned Score-based Generative Models | Li Zhang et.al. | 2302.06504v1 | link |
2023-02-13 | Technical Note: PDE-constrained Optimization Formulation for Tumor Growth Model Calibration | Baoshan Liang et.al. | 2302.06445v1 | null |
2023-02-13 | ContrasInver: Voxel-wise Contrastive Semi-supervised Learning for Seismic Inversion | Yimin Dou et.al. | 2302.06441v1 | null |
2023-02-13 | Interplay between advective, diffusive, and active barriers in Rayleigh-Bénard flow | Nikolas Aksamit et.al. | 2302.06319v1 | null |
2023-02-10 | Example-Based Sampling with Diffusion Models | Bastien Doignies et.al. | 2302.05116v1 | null |
2023-02-09 | UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models | Wenliang Zhao et.al. | 2302.04867v1 | link |
2023-02-09 | RelightableHands: Efficient Neural Relighting of Articulated Hand Models | Shun Iwase et.al. | 2302.04866v1 | null |
2023-02-09 | Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation | Anton Voronov et.al. | 2302.04841v1 | link |
2023-02-09 | Better Diffusion Models Further Improve Adversarial Training | Zekai Wang et.al. | 2302.04638v1 | link |
2023-02-09 | Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples | Chumeng Liang et.al. | 2302.04578v1 | link |
2023-02-08 | PFGM++: Unlocking the Potential of Physics-Inspired Generative Models | Yilun Xu et.al. | 2302.04265v1 | link |
2023-02-08 | GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models | Shawn Shan et.al. | 2302.04222v1 | null |
2023-02-08 | Policy Evaluation in Decentralized POMDPs with Belief Sharing | Mert Kayaalp et.al. | 2302.04151v1 | link |
2023-02-08 | Dimensional lattice Boltzmann method for transport phenomena simulation without conversion to lattice units | Ivan Talão Martins et.al. | 2302.04120v1 | null |
2023-02-07 | Long Horizon Temperature Scaling | Andy Shih et.al. | 2302.03686v1 | link |
2023-02-07 | Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery | Yuxin Wen et.al. | 2302.03668v1 | link |
2023-02-07 | HumanMAC: Masked Motion Completion for Human Motion Prediction | Ling-Hao Chen et.al. | 2302.03665v1 | link |
2023-02-07 | Graph Generation with Destination-Driven Diffusion Mixture | Jaehyeong Jo et.al. | 2302.03596v1 | link |
2023-02-06 | Zero-shot Image-to-Image Translation | Gaurav Parmar et.al. | 2302.03027v1 | link |
2023-02-06 | Structure and Content-Guided Video Synthesis with Diffusion Models | Patrick Esser et.al. | 2302.03011v1 | null |
2023-02-03 | AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners | Zhixuan Liang et.al. | 2302.01877v1 | link |
2023-02-03 | TEXTure: Text-Guided Texturing of 3D Shapes | Elad Richardson et.al. | 2302.01721v1 | link |
2023-02-03 | Learning End-to-End Channel Coding with Diffusion Models | Muah Kim et.al. | 2302.01714v1 | null |
2023-02-03 | A Lipschitz Bandits Approach for Continuous Hyperparameter Optimization | Yasong Feng et.al. | 2302.01539v1 | null |
2023-02-02 | Dreamix: Video Diffusion Models are General Video Editors | Eyal Molad et.al. | 2302.01329v1 | null |
2023-02-02 | Are Diffusion Models Vulnerable to Membership Inference Attacks? | Jinhao Duan et.al. | 2302.01316v1 | link |
2023-02-01 | Stable Target Field for Reduced Variance Score Estimation in Diffusion Models | Yilun Xu et.al. | 2302.00670v1 | link |
2023-01-31 | Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models | Hila Chefer et.al. | 2301.13826v1 | link |
2023-01-30 | Extracting Training Data from Diffusion Models | Nicholas Carlini et.al. | 2301.13188v1 | null |
2023-01-30 | Shape-aware Text-driven Layered Video Editing | Yao-Chih Lee et.al. | 2301.13173v1 | null |
2023-01-30 | GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis | Ming Tao et.al. | 2301.12959v1 | link |
2023-01-30 | ERA-Solver: Error-Robust Adams Solver for Fast Sampling of Diffusion Probabilistic Models | Shengmeng Li et.al. | 2301.12935v1 | null |
2023-01-30 | PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks | Arian Bakhtiarnia et.al. | 2301.12914v1 | null |
2023-01-27 | Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Flavio Schneider et.al. | 2301.11757v1 | link |
2023-01-27 | Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines? | Victor Boutin et.al. | 2301.11722v1 | link |
2023-01-26 | simple diffusion: End-to-end diffusion for high resolution images | Emiel Hoogeboom et.al. | 2301.11093v1 | null |
2023-01-26 | On the Importance of Noise Scheduling for Diffusion Models | Ting Chen et.al. | 2301.10972v1 | null |
2023-01-25 | Imitating Human Behaviour with Diffusion Models | Tim Pearce et.al. | 2301.10677v1 | link |
2023-01-24 | Bipartite Graph Diffusion Model for Human Interaction Generation | Baptiste Chopin et.al. | 2301.10134v1 | link |
2023-01-24 | DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model | Fan Zhang et.al. | 2301.10047v1 | link |
2023-01-24 | Membership Inference of Diffusion Models | Hailong Hu et.al. | 2301.09956v1 | link |
2023-01-23 | LEGO-Net: Learning Regular Rearrangements of Objects in Rooms | Qiuhong Anna Wei et.al. | 2301.09629v1 | null |
2023-01-23 | Evaluation of Light Collection from Highly Scattering Media using Wavelength-Shifting Fibers | Andrew Wilhelm et.al. | 2301.09608v1 | null |
2023-01-23 | StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis | Axel Sauer et.al. | 2301.09515v1 | link |
2023-01-23 | DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion | Qitian Wu et.al. | 2301.09474v1 | link |
2023-01-19 | Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models | Jun Yue et.al. | 2301.08072v1 | null |
2023-01-18 | Targeted Image Reconstruction by Sampling Pre-trained Diffusion Model | Jiageng Zheng et.al. | 2301.07557v1 | null |
2023-01-17 | GLIGEN: Open-Set Grounded Text-to-Image Generation | Yuheng Li et.al. | 2301.07093v1 | link |
2023-01-13 | In BLOOM: Creativity and Affinity in Artificial Lyrics and Art | Evan Crothers et.al. | 2301.05402v1 | link |
2023-01-12 | Guiding Text-to-Image Diffusion Model Towards Grounded Generation | Ziyi Li et.al. | 2301.05221v1 | null |
2023-01-12 | Thompson Sampling with Diffusion Generative Prior | Yu-Guan Hsieh et.al. | 2301.05182v1 | null |
(<a href=#Updated-on-20240404>back to top</a>)
Publish Date | Title | Authors | PDF | Code |
---|---|---|---|---|
2024-04-02 | Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation | Wangguandong Zheng et.al. | 2404.01843v1 | null |
2024-04-02 | FashionEngine: Interactive Generation and Editing of 3D Clothed Humans | Tao Hu et.al. | 2404.01655v1 | null |
2024-04-01 | Categorical semiotics: Foundations for Knowledge Integration | Carlos Leandro et.al. | 2404.01526v1 | null |
2024-04-01 | Can Biases in ImageNet Models Explain Generalization? | Paul Gavrikov et.al. | 2404.01509v1 | link |
2024-04-02 | GDA: Generalized Diffusion for Robust Test-time Adaptation | Yun-Yun Tsai et.al. | 2404.00095v2 | null |
2024-03-29 | Optimal Communication for Classic Functions in the Coordinator Model and Beyond | Hossein Esfandiari et.al. | 2403.20307v1 | null |
2024-03-29 | Sketch-to-Architecture: Generative AI-aided Architectural Design | Pengzhi Li et.al. | 2403.20186v1 | null |
2024-03-28 | Dealing with Missing Modalities in Multimodal Recommendation: a Feature Propagation-based Approach | Daniele Malitesta et.al. | 2403.19841v1 | null |
2024-03-28 | TASR: A Novel Trust-Aware Stackelberg Routing Algorithm to Mitigate Traffic Congestion | Doris E. M. Brown et.al. | 2403.19831v1 | null |
2024-03-26 | Neural Attributed Community Search at Billion Scale | Jianwei Wang et.al. | 2403.18874v1 | null |
2024-03-27 | A Path Towards Legal Autonomy: An interoperable and explainable approach to extracting, transforming, loading and computing legal information using large language models, expert systems and Bayesian networks | Axel Constant et.al. | 2403.18537v1 | null |
2024-03-27 | U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models | Ilias Mitsouras et.al. | 2403.18425v1 | null |
2024-03-27 | ECNet: Effective Controllable Text-to-Image Diffusion Models | Sicheng Li et.al. | 2403.18417v1 | null |
2024-03-26 | Search and Society: Reimagining Information Access for Radical Futures | Bhaskar Mitra et.al. | 2403.17901v1 | null |
2024-03-26 | ExpressEdit: Video Editing with Natural Language and Sketching | Bekzat Tilekbay et.al. | 2403.17693v1 | null |
2024-03-26 | Equipping Sketch Patches with Context-Aware Positional Encoding for Graphic Sketch Representation | Sicong Zang et.al. | 2403.17525v1 | null |
2024-03-25 | On Policy Reuse: An Expressive Language for Representing and Executing General Policies that Call Other Policies | Blai Bonet et.al. | 2403.16824v1 | null |
2024-03-25 | CodeS: Natural Language to Code Repository via Multi-Layer Sketch | Daoguang Zan et.al. | 2403.16443v1 | link |
2024-03-24 | Combined Task and Motion Planning Via Sketch Decompositions (Extended Version with Supplementary Material) | Magí Dalmau-Moreno et.al. | 2403.16277v1 | null |
2024-03-22 | Efficiently Estimating Mutual Information Between Attributes Across Tables | Aécio Santos et.al. | 2403.15553v1 | null |
2024-03-22 | Fourier Transform-based Estimators for Data Sketches | Seth Pettie et.al. | 2403.15366v1 | null |
2024-03-25 | Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing | Alberto Baldrati et.al. | 2403.14828v2 | link |
2024-03-21 | Object-Centric Domain Randomization for 3D Shape Reconstruction in the Wild | Junhyeong Cho et.al. | 2403.14539v1 | null |
2024-03-21 | External Knowledge Enhanced 3D Scene Generation from Sketch | Zijie Wu et.al. | 2403.14121v1 | null |
2024-03-20 | Towards an extension of Fault Trees in the Predictive Maintenance Scenario | Roberta De Fazio et.al. | 2403.13785v1 | null |
2024-03-25 | Diagrammatic Instructions to Specify Spatial Objectives and Constraints with Applications to Mobile Base Placement | Qilin Sun et.al. | 2403.12465v2 | null |
2024-03-18 | Towards a Theory of Pragmatic Information | Edward D. Weinberger et.al. | 2403.12324v1 | null |
2024-03-17 | Stylized Face Sketch Extraction via Generative Prior with Limited Data | Kwan Yun et.al. | 2403.11263v1 | link |
2024-03-16 | RETINAQA : A Knowledge Base Question Answering Model Robust to both Answerable and Unanswerable Questions | Prayushi Faldu et.al. | 2403.10849v1 | null |
2024-03-15 | Animate Your Motion: Turning Still Images into Dynamic Videos | Mingxiao Li et.al. | 2403.10179v1 | null |
2024-03-14 | What Sketch Explainability Really Means for Downstream Tasks | Hmrishav Bandyopadhyay et.al. | 2403.09480v1 | null |
2024-03-14 | SketchINR: A First Look into Sketches as Implicit Neural Representations | Hmrishav Bandyopadhyay et.al. | 2403.09344v1 | null |
2024-03-14 | Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset | Hugo Laurençon et.al. | 2403.09029v1 | null |
2024-03-13 | ARtVista: Gateway To Empower Anyone Into Artist | Trong-Vu Hoang et.al. | 2403.08876v1 | null |
2024-03-13 | HAIFIT: Human-Centered AI for Fashion Image Translation | Jianan Jiang et.al. | 2403.08651v1 | link |
2024-03-13 | Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models | Jian Lin et.al. | 2403.08266v1 | null |
2024-03-12 | It’s All About Your Sketch: Democratising Sketch Control in Diffusion Models | Subhadeep Koley et.al. | 2403.07234v1 | link |
2024-03-12 | You’ll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval | Subhadeep Koley et.al. | 2403.07222v1 | null |
2024-03-12 | Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers | Subhadeep Koley et.al. | 2403.07214v1 | null |
2024-03-11 | How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval? | Subhadeep Koley et.al. | 2403.07203v1 | null |
2024-03-11 | Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback | Adarsh N L et.al. | 2403.06735v1 | null |
2024-03-08 | Data-Dependent LSH for the Earth Mover’s Distance | Rajesh Jayaram et.al. | 2403.05041v1 | null |
2024-03-07 | A challenge in A(G)I, cybernetics revived in the Ouroboros Model as one algorithm for all thinking | Knud Thomsen et.al. | 2403.04292v1 | null |
2024-03-06 | NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | Takahiro Shirakawa et.al. | 2403.03485v1 | link |
2024-03-07 | DLP-GAN: learning to draw modern Chinese landscape photos with generative adversarial network | Xiangquan Gui et.al. | 2403.03456v2 | null |
2024-03-05 | SmartSantander: IoT Experimentation over a Smart City Testbed | Luis Sanchez et.al. | 2403.03196v1 | null |
2024-03-05 | CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following | Kaiyan Zhang et.al. | 2403.03129v1 | null |
2024-03-05 | RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches | Priya Sundaresan et.al. | 2403.02709v1 | null |
2024-03-02 | Euclidean distance compression via deep random features | Brett Leroux et.al. | 2403.01327v1 | null |
2024-02-29 | CoMeT: Count-Min-Sketch-based Row Tracking to Mitigate RowHammer at Low Cost | F. Nisa Bostanci et.al. | 2402.18769v1 | link |
2024-02-28 | DynaWarp – Efficient, large-scale log storage and retrieval | Julian Reichinger et.al. | 2402.18355v1 | null |
2024-02-28 | Block and Detail: Scaffolding Sketch-to-Image Generation | Vishnu Sarukkai et.al. | 2402.18116v1 | null |
2024-02-27 | Decremental $(1+\epsilon)$-Approximate Maximum Eigenvector: Dynamic Power Method | Deeksha Adil et.al. | 2402.17929v1 | null |
2024-02-27 | Surgment: Segmentation-enabled Semantic Search and Creation of Visual Question and Feedback to Support Video-Based Surgery Learning | Jingying Wang et.al. | 2402.17903v1 | null |
2024-02-27 | CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention | Mohammad Sadil Khan et.al. | 2402.17678v1 | null |
2024-02-27 | CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing | Chufeng Xiao et.al. | 2402.17624v1 | null |
2024-02-27 | Equivariant ideals of polynomials | Arka Ghosh et.al. | 2402.17604v1 | null |
2024-02-25 | Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries | Mike Heddes et.al. | 2402.15953v1 | link |
2024-02-23 | Genie: Generative Interactive Environments | Jake Bruce et.al. | 2402.15391v1 | null |
2024-02-22 | Semantic Image Synthesis with Unconditional Generator | Jungwoo Chae et.al. | 2402.14395v1 | null |
2024-02-21 | Sketching AI Concepts with Capabilities and Examples: AI Innovation in the Intensive Care Unit | Nur Yildirim et.al. | 2402.13437v1 | null |
2024-02-20 | Quantitative causality, causality-guided scientific discovery, and causal machine learning | X. San Liang et.al. | 2402.13427v1 | null |
2024-02-20 | Almost-Tight Bounds on Preserving Cuts in Classes of Submodular Hypergraphs | Sanjeev Khanna et.al. | 2402.13151v1 | null |
2024-02-17 | Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning | Hadi M. Dolatabadi et.al. | 2402.11237v1 | null |
2024-02-17 | Automated Optimization of Parameterized Data-Plane Programs with Parasol | Mary Hogan et.al. | 2402.11155v1 | null |
2024-02-13 | Sampling Space-Saving Set Sketches | Homin K. Lee et.al. | 2402.08604v1 | link |
2024-02-13 | One-to-many Reconstruction of 3D Geometry of cultural Artifacts using a synthetically trained Generative Model | Thomas Pöllabauer et.al. | 2402.08310v1 | null |
2024-02-13 | Epistemic Power, Objectivity and Gender in AI Ethics Labor: Legitimizing Located Complaints | David Gray Widder et.al. | 2402.08171v1 | null |
2024-02-13 | Randomized Algorithms for Symmetric Nonnegative Matrix Factorization | Koby Hayashi et.al. | 2402.08134v1 | null |
2024-02-10 | Guided Sketch-Based Program Induction by Search Gradients | Ahmad Ayaz Amin et.al. | 2402.06990v1 | null |
2024-02-09 | Squidgets: Sketch-based Widget Design and Direct Manipulation of 3D Scene | Joonho Kim et.al. | 2402.06795v1 | null |
2024-02-08 | InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write | Blagoj Mitrevski et.al. | 2402.05804v1 | null |
2024-02-08 | A Concept for Reconstructing Stucco Statues from historic Sketches using synthetic Data only | Thomas Pöllabauer et.al. | 2402.05593v1 | null |
2024-02-06 | Gradient Sketches for Training Data Attribution and Studying the Loss Landscape | Andrea Schioppa et.al. | 2402.03994v1 | null |
2024-02-06 | 3Doodle: Compact Abstraction of Objects with 3D Strokes | Changwoon Choi et.al. | 2402.03690v1 | null |
2024-02-05 | Computing Generic Fibres of Polynomial Ideals with FGLM and Hensel Lifting | Jérémy Berthomieu et.al. | 2402.03144v1 | null |
2024-02-03 | Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization | Bo Yang et.al. | 2402.02141v1 | link |
2024-02-02 | Solitons, dispersive shock waves and Noel Fredrick Smyth | Saleh Baqer et.al. | 2402.01332v1 | null |
2024-02-01 | Deep Robot Sketching: An application of Deep Q-Learning Networks for human-like sketching | Raul Fernandez-Fernandez et.al. | 2402.00676v1 | null |
2024-02-01 | High-Quality Medical Image Generation from Free-hand Sketch | Quan Huu Cap et.al. | 2402.00353v1 | null |
2024-01-31 | On The Power of Subtle Expressive Cues in the Perception of Human Affects | Ezgi Dede et.al. | 2401.18013v1 | null |
2024-02-04 | Fine-Grained Zero-Shot Learning: Advances, Challenges, and Prospects | Jingcai Guo et.al. | 2401.17766v2 | link |
2024-01-31 | Estimating Diffusion Degree on Graph Streams | Vinit Ramesh Gore et.al. | 2401.17611v1 | null |
2024-01-31 | Topology-Aware Latent Diffusion for 3D Shape Generation | Jiangbei Hu et.al. | 2401.17603v1 | null |
2024-01-29 | FPGA Technology Mapping Using Sketch-Guided Program Synthesis | Gus Henry Smith et.al. | 2401.16526v1 | null |
2024-01-29 | Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors | Shiyin Dong et.al. | 2401.16459v1 | null |
2024-01-25 | Incremental Proof Development in Dafny with Module-Based Induction | Son Ho et.al. | 2401.16233v1 | null |
2024-01-26 | Sketch and Refine: Towards Fast and Accurate Lane Detection | Chao Chen et.al. | 2401.14729v1 | link |
2024-01-27 | Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation | Minglin Chen et.al. | 2401.14257v2 | null |
2024-01-22 | PatternPortrait: Draw Me Like One of Your Scribbles | Sabine Wieluch et.al. | 2401.13001v1 | null |
2024-01-22 | Automated Completion of Statements and Proofs in Synthetic Geometry: an Approach based on Constraint Solving | Salwa Tabet Gonzalez et.al. | 2401.11898v1 | null |
2024-01-18 | Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access | Saibo Geng et.al. | 2401.09967v1 | null |
2024-01-21 | Towards Identifiable Unsupervised Domain Translation: A Diversified Distribution Matching Approach | Sagar Shrestha et.al. | 2401.09671v2 | null |
2024-01-12 | Masked Attribute Description Embedding for Cloth-Changing Person Re-identification | Chunlei Peng et.al. | 2401.05646v2 | link |
2024-01-11 | DrawTalking: Building Interactive Worlds by Sketching and Speaking | Karl Toby Rosenberg et.al. | 2401.05631v1 | null |
2024-01-10 | Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval | Eunyi Lyou et.al. | 2401.04860v1 | null |
2024-01-09 | Content-Conditioned Generation of Stylized Free hand Sketches | Jiajun Liu et.al. | 2401.04739v1 | null |
2024-01-09 | Representative Feature Extraction During Diffusion Process for Sketch Extraction with One Example | Kwan Yun et.al. | 2401.04362v1 | null |
2024-01-08 | Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach | Huanyu Liu et.al. | 2401.03742v1 | link |
2024-01-05 | FedNS: A Fast Sketching Newton-Type Algorithm for Federated Learning | Jian Li et.al. | 2401.02734v1 | link |
2024-01-02 | ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text | Dingkun Yan et.al. | 2401.01456v1 | link |
2024-01-01 | Free-form Shape Modeling in XR: A Systematic Review | Shounak Chatterjee et.al. | 2401.00924v1 | null |
2024-01-01 | DiffMorph: Text-less Image Morphing with Diffusion Models | Shounak Chatterjee et.al. | 2401.00739v1 | null |
2023-12-31 | SynCDR : Training Cross Domain Retrieval Models with Synthetic Data | Samarth Mishra et.al. | 2401.00420v1 | link |
2023-12-31 | Multi-Granularity Representation Learning for Sketch-based Dynamic Face Image Retrieval | Liang Wang et.al. | 2401.00371v1 | link |
2023-12-28 | A randomized algorithm to solve reduced rank operator regression | Giacomo Turri et.al. | 2312.17348v1 | link |
2024-01-03 | SVGDreamer: Text Guided SVG Generation with Diffusion Model | Ximing Xing et.al. | 2312.16476v2 | link |
2023-12-22 | Generative AI and the History of Architecture | Joern Ploennigs et.al. | 2312.15106v1 | null |
2023-12-22 | A Modular Approach to Metatheoretic Reasoning for Extensible Languages | Dawn Michaelson et.al. | 2312.14374v1 | null |
2023-12-21 | On the Hardness of Analyzing Quantum Programs Quantitatively | Martin Avanzini et.al. | 2312.13657v1 | null |
2023-12-18 | Open Vocabulary Semantic Scene Sketch Understanding | Ahmed Bourouis et.al. | 2312.12463v1 | null |
2023-12-19 | Sketch Vision: Artificial Intelligence with Sight for Imagination | Demircan Tas et.al. | 2312.12270v1 | null |
2023-12-19 | Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model | Lingjun Zhang et.al. | 2312.12232v1 | link |
2023-12-19 | CreativeConnect: Supporting Reference Recombination for Graphic Design Ideation with Generative AI | DaEun Choi et.al. | 2312.11949v1 | null |
2023-12-16 | Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval | Decheng Liu et.al. | 2312.10320v1 | link |
2023-12-15 | Sketch and shift: a robust decoder for compressive clustering | Ayoub Belhadji et.al. | 2312.09940v1 | null |
2023-12-15 | Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception | Xiao Wang et.al. | 2312.09812v1 | link |
2023-12-14 | Matching Noisy Keys for Obfuscation | Charlie Dickens et.al. | 2312.08981v1 | null |
2023-12-14 | Solving Dense Linear Systems Faster than via Preconditioning | Michał Dereziński et.al. | 2312.08893v1 | null |
2023-12-13 | Enhance Sketch Recognition’s Explainability via Semantic Component-Level Parsing | Guangming Zhu et.al. | 2312.07875v1 | link |
2023-12-12 | Improved Frequency Estimation Algorithms with and without Predictions | Anders Aamand et.al. | 2312.07535v1 | null |
2023-12-09 | BARET : Balanced Attention based Real image Editing driven by Target-text Inversion | Yuming Qiao et.al. | 2312.05482v1 | null |
2023-12-07 | Optimal Multi-Pass Lower Bounds for MST in Dynamic Streams | Sepehr Assadi et.al. | 2312.04674v1 | null |
2023-12-07 | Deep3DSketch: 3D modeling from Free-hand Sketches with View- and Structural-Aware Adversarial Training | Tianrun Chen et.al. | 2312.04435v1 | null |
2023-12-07 | DemoCaricature: Democratising Caricature Generation with a Rough Sketch | Dar-Yen Chen et.al. | 2312.04364v1 | null |
2023-12-07 | Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes | Hmrishav Bandyopadhyay et.al. | 2312.04043v1 | null |
2023-12-06 | CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models | Hailin Zhang et.al. | 2312.03256v1 | link |
2023-12-05 | SEVA: Leveraging sketches to evaluate alignment between human and machine visual abstraction | Kushin Mukherjee et.al. | 2312.03035v1 | link |
2023-12-08 | FreestyleRet: Retrieving Images from Style-Diversified Queries | Hao Li et.al. | 2312.02428v2 | link |
2023-12-04 | CLIPDrawX: Primitive-based Explanations for Text Guided Sketch Synthesis | Nityanand Mathur et.al. | 2312.02345v1 | null |
2023-12-03 | Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials | Viktor Zaverkin et.al. | 2312.01416v1 | null |
2023-11-30 | Sketch Input Method Editor: A Comprehensive Dataset and Methodology for Systematic Input Recognition | Guangming Zhu et.al. | 2311.18254v1 | link |
2023-11-29 | Analyzing Query Optimizer Performance in the Presence and Absence of Cardinality Estimates | Asoke Datta et.al. | 2311.17293v1 | null |
2023-11-28 | Time- and Communication-Efficient Overlay Network Construction via Gossip | Fabien Dufoulon et.al. | 2311.17115v1 | null |
2023-11-28 | SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models | Yuwei Guo et.al. | 2311.16933v1 | null |
2023-11-28 | ContextSeg: Sketch Semantic Segmentation by Querying the Context with Attention | Jiawei Wang et.al. | 2311.16682v1 | null |
2023-11-28 | Text-Driven Image Editing via Learnable Regions | Yuanze Lin et.al. | 2311.16432v1 | link |
2023-11-27 | MAST: Model-Agnostic Sparsified Training | Yury Demidovich et.al. | 2311.16086v1 | link |
2023-11-26 | Sketch Video Synthesis | Yudian Zheng et.al. | 2311.15306v1 | link |
2023-11-25 | A unified framework for learning with nonlinear model classes from arbitrary linear samples | Ben Adcock et.al. | 2311.14886v1 | null |
2023-11-24 | Data-to-Text Bilingual Generation | Guy Lapalme et.al. | 2311.14808v1 | link |
2023-11-24 | One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space | Raghav Addanki et.al. | 2311.14652v1 | null |
2023-11-21 | Breathing Life Into Sketches Using Text-to-Video Priors | Rinon Gal et.al. | 2311.13608v1 | null |
2023-11-22 | Adaptive Sampling for Deep Learning via Efficient Nonparametric Proxies | Shabnam Daghaghi et.al. | 2311.13583v1 | null |
2023-11-21 | From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design | Cyril Picard et.al. | 2311.12668v1 | null |
2023-11-19 | AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort | Wen Wang et.al. | 2311.11243v1 | null |
2023-11-17 | Scaling TabPFN: Sketching and Feature Selection for Tabular Prior-Data Fitted Networks | Benjamin Feuer et.al. | 2311.10609v1 | null |
2023-11-09 | Chain of Images for Intuitively Reasoning | Fanxu Meng et.al. | 2311.09241v1 | link |
2023-11-14 | Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework | Weiqin Zu et.al. | 2311.08244v1 | null |
2023-11-13 | Fast and Space-Efficient Parallel Algorithms for Influence Maximization | Letong Wang et.al. | 2311.07554v1 | link |
2023-11-13 | Sketch-based Video Object Segmentation: Benchmark and Analysis | Ruolin Yang et.al. | 2311.07261v1 | null |
2023-11-09 | General Policies, Subgoal Structure, and Planning Width | Blai Bonet et.al. | 2311.05490v1 | null |
2023-11-09 | Control3D: Towards Controllable Text-to-3D Generation | Yang Chen et.al. | 2311.05461v1 | null |
2023-11-08 | Prompt Sketching for Large Language Models | Luca Beurer-Kellner et.al. | 2311.04954v1 | null |
2023-11-07 | DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding | Kehinde Ajayi et.al. | 2311.04098v1 | link |
2023-11-06 | Sketching methods with small window guarantee using minimum decycling sets | Guillaume Marçais et.al. | 2311.03592v1 | link |
2023-11-05 | Sketching Multidimensional Time Series for Fast Discord Mining | Chin-Chia Michael Yeh et.al. | 2311.03393v1 | null |
2023-11-03 | Neural Collage Transfer: Artistic Reconstruction via Material Manipulation | Ganghun Lee et.al. | 2311.02202v1 | link |
2023-11-06 | RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches | Jiayuan Gu et.al. | 2311.01977v2 | null |
2023-11-03 | Hardness of Low Rank Approximation of Entrywise Transformed Matrix Products | Tamas Sarlos et.al. | 2311.01960v1 | null |
2023-11-03 | Towards Concept-Aware Large Language Models | Chen Shani et.al. | 2311.01866v1 | link |
2023-11-07 | inkn’hue: Enhancing Manga Colorization from Multiple Priors with Alignment Multi-Encoder VAE | Tawin Jiramahapokee et.al. | 2311.01804v2 | link |
2023-10-31 | Progress and outlook on advanced fly scans based on Mamba | Peng-Cheng Li et.al. | 2310.20106v1 | link |
2023-10-30 | The Expressibility of Polynomial based Attention Scheme | Zhao Song et.al. | 2310.20051v1 | null |
2023-10-29 | Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming | Gregory Dexter et.al. | 2310.19068v1 | null |
2023-10-29 | Customize StyleGAN with One Hand Sketch | Shaocong Zhang et.al. | 2310.18949v1 | null |
2023-10-28 | Deep3DSketch+: Obtaining Customized 3D Model by Single Free-Hand Sketch through Deep Learning | Ying Zang et.al. | 2310.18609v1 | null |
2023-10-27 | Deep3DSketch++: High-Fidelity 3D Modeling from Single Free-hand Sketches | Ying Zang et.al. | 2310.18178v1 | null |
2023-10-27 | Reality3DSketch: Rapid 3D Modeling of Objects from Single Freehand Sketches | Tianrun Chen et.al. | 2310.18148v1 | null |
2023-10-27 | On General Language Understanding | David Schlangen et.al. | 2310.18038v1 | null |
2023-10-27 | Sketching and Streaming for Dictionary Compression | Ruben Becker et.al. | 2310.17980v1 | null |
2023-10-26 | Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models | Dingli Yu et.al. | 2310.17567v1 | null |
2023-10-24 | Emergent Communication in Interactive Sketch Question Answering | Zixing Lei et.al. | 2310.15597v1 | link |
2023-10-24 | Fast multiplication of random dense matrices with fixed sparse matrices | Tianyu Liang et.al. | 2310.15419v1 | link |
2023-10-18 | A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge | Yikun Han et.al. | 2310.11703v1 | null |
2023-10-17 | Matrix Compression via Randomized Low Rank and Low Precision Factorization | Rajarshi Saha et.al. | 2310.11028v1 | link |
2023-10-16 | HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending | Tianyi Wei et.al. | 2310.10651v1 | link |
2023-10-16 | Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models | Vishaal Udandarao et.al. | 2310.08577v2 | link |
2023-10-12 | Visualizing a Nondeterministic to Deterministic Finite-State Machine Transformation | Tijana Minic et.al. | 2310.08248v1 | link |
2023-10-11 | On $(1+\varepsilon)$-Approximate Flow Sparsifiers | Yu Chen et.al. | 2310.07857v1 | null |
2023-10-10 | SketchBodyNet: A Sketch-Driven Multi-faceted Decoder Network for 3D Human Reconstruction | Fei Wang et.al. | 2310.06577v1 | link |
2023-10-15 | HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation | Yaosen Chen et.al. | 2310.05720v3 | link |
2023-10-09 | Logic-guided Deep Reinforcement Learning for Stock Trading | Zhiming Li et.al. | 2310.05551v1 | null |
2023-10-08 | Transforming Pixels into a Masterpiece: AI-Powered Art Restoration using a Novel Distributed Denoising CNN (DDCNN) | Sankar B. et.al. | 2310.05270v1 | null |
2023-10-06 | Hanging in there: Prenatal origins of antigravity homeostasis in humans | Nicholas M. Wilkinson et.al. | 2310.04168v1 | null |
2023-10-06 | Deterministic Clustering in High Dimensional Spaces: Sketches and Approximation | Vincent Cohen-Addad et.al. | 2310.04076v1 | null |
2023-10-05 | Matrix Completion from One-Bit Dither Samples | Arian Eamaz et.al. | 2310.03224v1 | null |
2023-10-04 | Streaming Euclidean $k$-median and $k$-means with $o(\log n)$ Space | Vincent Cohen-Addad et.al. | 2310.02882v1 | null |
2023-10-04 | On the tilt of the Earth’s polar axis ($κλιμα$): Some ‘impressionist’ remarks | V. Courtillot et.al. | 2310.02768v1 | null |
2023-10-03 | View-Independent Adjoint Light Tracing for Lighting Design Optimization | Lukas Lipp et.al. | 2310.02043v1 | null |
2023-10-03 | Randomized Dimension Reduction with Statistical Guarantees | Yijun Dong et.al. | 2310.01739v1 | null |
2023-10-02 | PolySketchFormer: Fast Transformers via Sketches for Polynomial Kernels | Praneeth Kacham et.al. | 2310.01655v1 | null |
2023-09-29 | Toward Operationalizing Pipeline-aware ML Fairness: A Research Agenda for Developing Practical Guidelines and Tools | Emily Black et.al. | 2309.17337v1 | null |
2023-09-28 | Sketch2CADScript: 3D Scene Reconstruction from 2D Sketch using Visual Transformer and Rhino Grasshopper | Hong-Bin Yang et.al. | 2309.16850v1 | null |
2023-09-28 | Multi-Modal Financial Time-Series Retrieval Through Latent Space Projections | Tom Bamford et.al. | 2309.16741v1 | null |
2023-09-28 | Language models in molecular discovery | Nikita Janakarajan et.al. | 2309.16235v1 | null |
2023-10-01 | Sampling Methods for Inner Product Sketching | Majid Daliri et.al. | 2309.16157v2 | link |
2023-09-27 | Fast Locality Sensitive Hashing with Theoretical Guarantee | Zongyuan Tan et.al. | 2309.15479v1 | null |
2023-09-25 | Guess & Sketch: Language Model Guided Transpilation | Celine Lee et.al. | 2309.14396v1 | null |
2023-09-22 | Deep3DSketch+: Rapid 3D Modeling from Single Free-hand Sketches | Tianrun Chen et.al. | 2309.13006v1 | null |
2023-09-22 | Visualization According to Statisticians: An Interview Study on the Role of Visualization for Inferential Statistics | Eric Newburger et.al. | 2309.12684v1 | null |
2023-09-22 | Towards medhub: A Self-Service Platform for Analysts and Physicians | Markus Höhn et.al. | 2309.11234v2 | null |
2023-09-20 | An Empirical Study of Malicious Code In PyPI Ecosystem | Wenbo Guo et.al. | 2309.11021v1 | link |
2023-09-19 | An overview of some mathematical techniques and problems linking 3D vision to 3D printing | Emiliano Cristiani et.al. | 2309.10549v1 | null |
2023-09-19 | Learning Orbitally Stable Systems for Diagrammatically Teaching | Weiming Zhi et.al. | 2309.10298v1 | null |
2023-09-18 | Completeness Thresholds for Memory Safety: Unbounded Guarantees via Bounded Proofs (Extended Abstract) | Tobias Reinhard et.al. | 2309.09731v1 | null |
2023-09-18 | Applying Security Testing Techniques to Automotive Engineering | Irdin Pekaric et.al. | 2309.09647v1 | null |
2023-09-15 | Active Learning for Fine-Grained Sketch-Based Image Retrieval | Himanshu Thakur et.al. | 2309.08743v1 | null |
2023-09-15 | Beyond Domain Gap: Exploiting Subjectivity in Sketch-Based Person Retrieval | Kejun Lin et.al. | 2309.08372v1 | link |
2023-09-14 | Landscape-Sketch-Step: An AI/ML-Based Metaheuristic for Surrogate Optimization Problems | Rafael Monteiro et.al. | 2309.07936v1 | link |
2023-09-12 | Grounded Language Acquisition From Object and Action Imagery | James Robert Kubricht et.al. | 2309.06335v1 | null |
2023-09-12 | OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | Wieger R. Punter et.al. | 2309.06051v1 | null |
2023-09-12 | GA-Sketching: Shape Modeling from Multi-View Sketching with Geometry-Aligned Deep Implicit Functions | Jie Zhou et.al. | 2309.05946v1 | link |
2023-09-11 | Photodetachment dynamics using nonlocal discrete-state-in-continuum model | Martin Čížek et.al. | 2309.05830v1 | null |
2023-09-10 | Streaming Semidefinite Programs: $O(\sqrt{n})$ Passes, Small Space and Fast Runtime | Zhao Song et.al. | 2309.05135v1 | null |
2023-09-08 | Receiving an algorithmic recommendation based on documentary filmmaking techniques | Samuel Gantier et.al. | 2309.04184v1 | null |
2023-09-07 | Learning from Demonstration via Probabilistic Diagrammatic Teaching | Weiming Zhi et.al. | 2309.03835v1 | null |
2023-09-07 | Adjacency Sketches in Adversarial Environments | Moni Naor et.al. | 2309.03728v1 | null |
2023-09-06 | An Evaluation of Software Sketches | Roy Friedman et.al. | 2309.03045v1 | null |
2023-09-03 | Business Process Text Sketch Automation Generation Using Large Language Model | Rui Zhu et.al. | 2309.01071v1 | null |
2023-09-02 | Online Adaptive Mahalanobis Distance Estimation | Lianke Qin et.al. | 2309.01030v1 | null |
2023-09-01 | Randomized Polar Codes for Anytime Distributed Machine Learning | Burak Bartan et.al. | 2309.00682v1 | null |
2023-09-01 | Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation | Fei Gao et.al. | 2309.00216v1 | link |
2023-08-31 | Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance | Zexin Hu et.al. | 2308.16725v1 | null |
2023-08-30 | Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems | Younghyun Cho et.al. | 2308.15720v1 | null |
2023-08-27 | SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation | Zhiyu Qu et.al. | 2308.14191v1 | link |
2023-08-25 | WorldSmith: Iterative and Expressive Prompting for World Building with a Generative AI | Hai Dang et.al. | 2308.13355v1 | null |
2023-08-25 | Bridging the Gap: Fine-to-Coarse Sketch Interpolation Network for High-Quality Animation Sketch Inbetweening | Jiaming Shen et.al. | 2308.13273v1 | null |
2023-08-21 | Geo-Sketcher: Rapid 3D Geological Modeling using Geological and Topographic Map Sketches | Ronan Amorim et.al. | 2308.12152v1 | null |
2023-08-24 | Bayesian Learning for Dynamic Target Localization with Human-provided Spatial Information | Min-Won Seo et.al. | 2308.11839v2 | null |
2023-08-22 | MatFuse: Controllable Material Generation with Diffusion Models | Giuseppe Vecchio et.al. | 2308.11408v1 | link |
2023-08-22 | Minwise-Independent Permutations with Insertion and Deletion of Features | Rameshwar Pratap et.al. | 2308.11240v1 | null |
2023-08-28 | Large Language Models for Software Engineering: A Systematic Literature Review | Xinyi Hou et.al. | 2308.10620v2 | null |
2023-08-16 | Freedom of Speech and AI Output | Eugene Volokh et.al. | 2308.08673v1 | null |
2023-08-16 | Painter: Teaching Auto-regressive Language Models to Draw Sketches | Reza Pourreza et.al. | 2308.08520v1 | null |
2023-08-15 | Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis via Stochastic Differential Equations without Training | Ximing Xing et.al. | 2308.07665v1 | link |
2023-08-11 | Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation | Yuki Endo et.al. | 2308.06027v1 | link |
2023-08-11 | Uncertainty-Aware Cross-Modal Transfer Network for Sketch-Based 3D Shape Retrieval | Yiyang Cai et.al. | 2308.05948v1 | null |
2023-08-20 | The Fast and the Private: Task-based Dataset Search | Zezhou Huang et.al. | 2308.05637v2 | null |
2023-08-12 | LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation | Leigang Qu et.al. | 2308.05095v2 | null |
2023-08-10 | Apple Vision Pro for Healthcare: “The Ultimate Display”? – Entering the Wonderland of Precision | Jan Egger et.al. | 2308.04313v3 | null |
2023-08-08 | Iterative Sketching for Secure Coded Regression | Neophytos Charalambides et.al. | 2308.04185v1 | null |
2023-08-06 | Gradient Coding through Iterative Block Leverage Score Sampling | Neophytos Charalambides et.al. | 2308.03096v1 | null |
2023-08-05 | Sketch and Text Guided Diffusion Model for Colored Point Cloud Generation | Zijie Wu et.al. | 2308.02874v1 | null |
2023-08-07 | SoK: The Ghost Trilemma | S. Mukherjee et.al. | 2308.02202v2 | null |
2023-08-07 | BEVControl: Accurately Controlling Street-view Elements with Multi-perspective Consistency via BEV Sketch Layout | Kairui Yang et.al. | 2308.01661v3 | null |
2023-08-03 | PPI-NET: End-to-End Parametric Primitive Inference | Liang Wang et.al. | 2308.01521v1 | null |
2023-08-01 | Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions | Samantha Chen et.al. | 2308.00273v1 | null |
2023-08-01 | CONSTRUCT: A Program Synthesis Approach for Reconstructing Control Algorithms from Embedded System Binaries in Cyber-Physical Systems | Ali Shokri et.al. | 2308.00250v1 | null |
2023-07-30 | RealityCanvas: Augmented Reality Sketching for Embedded and Responsive Scribble Animation Effects | Zhijie Xia et.al. | 2307.16116v1 | link |
2023-07-25 | Federated Heavy Hitter Recovery under Linear Sketching | Adria Gascon et.al. | 2307.13347v1 | null |
2023-07-24 | Learning Dense Correspondences between Photos and Sketches | Xuanchen Lu et.al. | 2307.12967v1 | null |
2023-07-18 | Semi-supervised Cycle-GAN for face photo-sketch translation in the wild | Chaofeng Chen et.al. | 2307.10281v1 | null |
2023-07-14 | Volumetric Wireframe Parsing from Neural Attraction Fields | Nan Xue et.al. | 2307.10206v1 | link |
2023-07-17 | Multi-Domain Learning with Modulation Adapters | Ekaterina Iakovleva et.al. | 2307.08528v1 | null |
2023-07-16 | InkSight: Leveraging Sketch Interaction for Documenting Chart Findings in Computational Notebooks | Yanna Lin et.al. | 2307.07922v1 | null |
2023-07-13 | Connectivity Labeling for Multiple Vertex Failures | Merav Parter et.al. | 2307.06276v2 | null |
2023-07-10 | Some Preliminary Steps Towards Metaverse Logic | Antonio L. Furtado et.al. | 2307.05574v1 | null |
2023-07-11 | A “Game of Like” : Online Social Network Sharing As Strategic Interaction | Emmanuel J. Genot et.al. | 2307.05063v1 | null |
2023-07-11 | Diffusion idea exploration for art generation | Nikhil Verma et.al. | 2307.04978v1 | null |
2023-07-08 | Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation | Aditya Sanghi et.al. | 2307.03869v1 | null |
2023-07-06 | Wireless Multi-Agent Generative AI: From Connected Intelligence to Collective Intelligence | Hang Zou et.al. | 2307.02757v1 | null |
2023-07-04 | Text + Sketch: Image Compression at Ultra Low Rates | Eric Lei et.al. | 2307.01944v1 | link |
2023-07-03 | Digital Twin-Empowered Communications: A New Frontier of Wireless Networks | Lina Bariah et.al. | 2307.00973v1 | null |
2023-07-04 | SketchMetaFace: A Learning-based Sketching Interface for High-fidelity 3D Character Face Modeling | Zhongjin Luo et.al. | 2307.00804v2 | null |
2023-06-27 | Cartesian institutions with evidence: Data and system modelling with diagrammatic constraints and generalized sketches | Zinovy Diskin et.al. | 2306.16284v1 | null |
2023-06-26 | Towards Optimal Effective Resistance Estimation | Rajat Vadiraj Dwaraknath et.al. | 2306.14820v1 | null |
2023-06-26 | DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models | Ximing Xing et.al. | 2306.14685v1 | link |
2023-06-25 | ALBUS: a Probabilistic Monitoring Algorithm to Counter Burst-Flood Attacks | Simon Scherrer et.al. | 2306.14328v1 | null |
2023-06-24 | Full Automation of Goal-driven LLM Dialog Threads with And-Or Recursors and Refiner Oracles | Paul Tarau et.al. | 2306.14077v1 | link |
2023-06-21 | PrivSketch: A Private Sketch-based Frequency Estimation Protocol for Data Streams | Ying Li et.al. | 2306.12144v1 | null |
2023-06-20 | Computing a human-like reaction time metric from stable recurrent vision models | Lore Goetschalckx et.al. | 2306.11582v1 | null |
2023-06-23 | 3D VR Sketch Guided 3D Shape Prototyping and Exploration | Ling Luo et.al. | 2306.10830v2 | link |
2023-06-19 | Shape Guided Gradient Voting for Domain Generalization | Jiaqi Xu et.al. | 2306.10809v1 | null |
2023-06-15 | Private Federated Frequency Estimation: Adapting to the Hardness of the Instance | Jingfeng Wu et.al. | 2306.09396v1 | null |
2023-06-15 | Conditional Human Sketch Synthesis with Explicit Abstraction Control | Dar-Yen Chen et.al. | 2306.09274v1 | null |
2023-06-15 | Behaviorally Typed State Machines in TypeScript for Heterogeneous Swarms | Roland Kuhn et.al. | 2306.09068v1 | link |
2023-06-15 | Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation | Zihui Gu et.al. | 2306.08891v1 | link |
2023-06-14 | Zero-Shot 3D Shape Sketch View Similarity and Retrieval | Gianluca Berardi et.al. | 2306.08541v1 | null |
2023-06-14 | Probing the unfolded configurations of a $\beta$-hairpin using sketch-map | Albert Ardevol et.al. | 2306.08429v1 | null |
2023-06-14 | CLIPXPlore: Coupled CLIP and Shape Spaces for 3D Shape Exploration | Jingyu Hu et.al. | 2306.08226v1 | null |
2023-06-13 | AniFaceDrawing: Anime Portrait Exploration during Your Sketching | Zhengyu Huang et.al. | 2306.07476v1 | null |
2023-06-15 | Strokes2Surface: Recovering Curve Networks From 4D Architectural Design Sketches | S. Rasoulzadeh et.al. | 2306.07220v2 | link |
2023-06-11 | Learning the Positions in CountSketch | Yi Li et.al. | 2306.06611v1 | null |
2023-06-09 | SENS: Sketch-based Implicit Neural Shape Modeling | Alexandre Binninger et.al. | 2306.06088v1 | null |
2023-06-09 | Sketch2Stress: Sketching with Structural Stress Awareness | Deng Yu et.al. | 2306.05911v1 | null |
2023-06-09 | Sketch Beautification: Learning Part Beautification and Structure Refinement for Sketches of Man-made Objects | Deng Yu et.al. | 2306.05832v1 | null |
2023-06-05 | Tracking Evolving labels using Cone based Oracles | Aditya Acharya et.al. | 2306.03306v1 | null |
2023-06-09 | Explicit Construction of q-ary 2-deletion Correcting Codes with Low Redundancy | Shu Liu et.al. | 2306.02868v2 | null |
2023-06-06 | VideoComposer: Compositional Video Synthesis with Motion Controllability | Xiang Wang et.al. | 2306.02018v2 | null |
2023-06-07 | Cross Modal Data Discovery over Structured and Unstructured Data Lakes | Mohamed Y. Eltabakh et.al. | 2306.00932v2 | link |
2023-06-01 | Towards Interactive Image Inpainting via Sketch Refinement | Chang Liu et.al. | 2306.00407v1 | link |
2023-06-01 | Faster Robust Tensor Power Method for Arbitrary Order | Yichuan Deng et.al. | 2306.00406v1 | null |
2023-05-31 | Knowledge Base Question Answering for Space Debris Queries | Paul Darm et.al. | 2305.19734v1 | link |
2023-05-30 | A Recipe for Efficient SBIR Models: Combining Relative Triplet Loss with Batch Normalization and Knowledge Distillation | Omar Seddati et.al. | 2305.18988v1 | null |
2023-05-30 | DiffSketching: Sketch Control Image Synthesis with Diffusion Models | Qiang Wang et.al. | 2305.18812v1 | link |
2023-05-30 | Generalization Bounds for Magnitude-Based Pruning via Sparse Matrix Sketching | Etash Kumar Guha et.al. | 2305.18789v1 | null |
2023-05-29 | Controllable Text-to-Image Generation with GPT-4 | Tianjun Zhang et.al. | 2305.18583v1 | null |
2023-05-29 | ANPL: Compiling Natural Programs with Interactive Decomposition | Di Huang et.al. | 2305.18498v1 | link |
2023-05-30 | TaleCrafter: Interactive Story Visualization with Multiple Characters | Yuan Gong et.al. | 2305.18247v2 | link |
2023-05-27 | Pruning at Initialization – A Sketching Perspective | Noga Bar et.al. | 2305.17559v1 | null |
2023-05-27 | On the Noise Sensitivity of the Randomized SVD | Elad Romanov et.al. | 2305.17435v1 | link |
2023-05-26 | BIG-C: a Multimodal Multi-Purpose Dataset for Bemba | Claytone Sikasote et.al. | 2305.17202v1 | link |
2023-05-26 | CARAMEL: A Succinct Read-Only Lookup Table via Compressed Static Functions | Benjamin Coleman et.al. | 2305.16545v1 | null |
2023-05-25 | SketchOGD: Memory-Efficient Continual Learning | Benjamin Wright et.al. | 2305.16424v1 | link |
2023-05-24 | DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models | Sungnyun Kim et.al. | 2305.15194v1 | link |
2023-05-23 | Distributed CONGEST Algorithms against Mobile Adversaries | Orr Fischer et.al. | 2305.14300v1 | null |
2023-05-19 | MaGIC: Multi-modality Guided Image Completion | Yongsheng Yu et.al. | 2305.11818v1 | null |
2023-05-19 | MIDI-Draw: Sketching to Control Melody Generation | Tashi Namgyal et.al. | 2305.11605v1 | null |
2023-05-17 | Data Extraction via Semantic Regular Expression Synthesis | Qiaochu Chen et.al. | 2305.10401v1 | null |
2023-05-15 | Scalable and Robust Tensor Ring Decomposition for Large-scale Data | Yicong He et.al. | 2305.09044v1 | null |
2023-05-15 | Validity Constraints for Data Analysis Workflows | Florian Schintke et.al. | 2305.08409v1 | null |
2023-05-15 | Fast and Efficient Matching Algorithm with Deadline Instances | Zhao Song et.al. | 2305.08353v1 | null |
2023-05-15 | Approximation and Progressive Display of Multiverse Analyses | Yang Liu et.al. | 2305.08323v1 | null |
2023-05-11 | Enabling Programming Thinking in Large Language Models Toward Code Generation | Jia Li et.al. | 2305.06599v1 | null |
2023-05-12 | Searching Mobile App Screens via Text + Doodle | Soumik Mohian et.al. | 2305.06165v2 | link |
2023-05-10 | Sketching the Future (STF): Applying Conditional Control Techniques to Text-to-Video Models | Rohan Dhesikan et.al. | 2305.05845v1 | link |
2023-05-09 | Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval | Shiyin Dong et.al. | 2305.05144v1 | null |
2023-05-08 | Behavioural Types for Local-First Software | Roland Kuhn et.al. | 2305.04848v1 | null |
2023-05-09 | Locally Attentional SDF Diffusion for Controllable 3D Shape Generation | Xin-Yang Zheng et.al. | 2305.04461v2 | null |
2023-05-08 | Oblivious algorithms for the Max-$k$AND Problem | Noah G. Singer et.al. | 2305.04438v1 | null |
2023-05-05 | Towards Feminist Intersectional XAI: From Explainability to Response-Ability | Goda Klumbyte et.al. | 2305.03375v1 | null |
2023-05-04 | Program Synthesis for Robot Learning from Demonstrations | Noah Patton et.al. | 2305.03129v1 | null |
2023-05-04 | HAISTA-NET: Human Assisted Instance Segmentation Through Attention | Muhammed Korkmaz et.al. | 2305.03105v1 | null |
2023-05-04 | Controllable Visual-Tactile Synthesis | Ruihan Gao et.al. | 2305.03051v1 | link |
2023-05-02 | A Survey of Methods for Converting Unstructured Data to CSG Models | Pierre-Alain Fayolle et.al. | 2305.01220v1 | null |
2023-05-01 | IndoorSim-to-OutdoorReal: Learning to Navigate Outdoors without any Outdoor Experience | Joanne Truong et.al. | 2305.01098v1 | null |
2023-05-01 | Design and Evaluation of a Bioinspired Tendon-Driven 3D-Printed Robotic Eye with Active Vision Capabilities | Hamid Osooli et.al. | 2305.01076v1 | link |
2023-05-01 | semantic neural model approach for face recognition from sketch | Chandana Navuluri et.al. | 2305.01058v1 | null |
2023-04-25 | Bridging graph data models: RDF, RDF-star, and property graphs as directed acyclic graphs | Ewout Gelling et.al. | 2304.13097v1 | link |
2023-04-25 | DualSlide: Global-to-Local Sketching Interface for Slide Content and Layout Design | Jiahao Weng et.al. | 2304.12506v1 | null |
2023-04-23 | SketchXAI: A First Look at Explainability for Human Sketches | Zhiyu Qu et.al. | 2304.11744v1 | null |
2023-04-22 | (Vector) Space is Not the Final Frontier: Product Search as Program Synthesis | Jacopo Tagliabue et.al. | 2304.11473v1 | null |
2023-04-21 | The centaur programmer – How Kasparov’s Advanced Chess spans over to the software development of the future | Pedro Alves et.al. | 2304.11172v1 | null |
2023-04-19 | StyleDEM: a Versatile Model for Authoring Terrains | Simon Perche et.al. | 2304.09626v1 | null |
2023-04-19 | Sensitivity estimation for differentially private query processing | Meifan Zhang et.al. | 2304.09546v1 | null |
2023-04-19 | A Protocol for Cast-as-Intended Verifiability with a Second Device | Johannes Müller et.al. | 2304.09456v1 | null |
2023-04-18 | Optimal Eigenvalue Approximation via Sketching | William Swartworth et.al. | 2304.09281v1 | null |
2023-04-18 | GUILGET: GUI Layout GEneration with Transformer | Andrey Sobolevsky et.al. | 2304.09012v1 | link |
2023-04-18 | Coefficient Synthesis for Threshold Automata | A. R. Balasubramanian et.al. | 2304.08917v1 | null |
2023-04-18 | Online fair division with arbitrary entitlements | Kushagra Chatterjee et.al. | 2304.08864v1 | null |
2023-04-17 | Learning Geometry-aware Representations by Sketching | Hyundo Lee et.al. | 2304.08204v1 | null |
2023-04-15 | Learned Interpolation for Better Streaming Quantile Approximation with Worst-Case Guarantees | Nicholas Schiefer et.al. | 2304.07652v1 | null |
2023-04-15 | Remembering Ludwig Dmitrievich Faddeev, our lifelong partner in mathematical physics | Daniel Sternheimer et.al. | 2304.07577v1 | null |
2023-04-14 | Pool Inference Attacks on Local Differential Privacy: Quantifying the Privacy Guarantees of Apple’s Count Mean Sketch in Practice | Andrea Gadotti et.al. | 2304.07134v1 | null |
2023-04-14 | On deterministic, constant memory triangular searches on the integer lattice | J. Alfredo Cruz-Carlon et.al. | 2304.07033v1 | null |
2023-04-13 | Learning Controllable 3D Diffusion Models from Single-view Images | Jiatao Gu et.al. | 2304.06700v1 | null |
2023-04-13 | On streaming approximation algorithms for constraint satisfaction problems | Noah G. Singer et.al. | 2304.06664v1 | null |
2023-04-13 | Solving Tensor Low Cycle Rank Approximation | Yichuan Deng et.al. | 2304.06594v1 | null |
2023-04-12 | TextANIMAR: Text-based 3D Animal Fine-Grained Retrieval | Trung-Nghia Le et.al. | 2304.06053v1 | null |
2023-04-12 | SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval | Trung-Nghia Le et.al. | 2304.05731v1 | null |
2023-04-10 | Identity-Guided Collaborative Learning for Cloth-Changing Person Reidentification | Zan Gao et.al. | 2304.04400v1 | null |
2023-04-09 | On Extend-Only Directed Posets and Derived Byzantine-Tolerant Replicated Data Types (Extended Version) | Florian Jacob et.al. | 2304.04318v1 | null |
2023-04-07 | ChiroDiff: Modelling chirographic data with Diffusion Models | Ayan Das et.al. | 2304.03785v1 | null |
2023-04-06 | SketchFFusion: Sketch-guided image editing with diffusion model | Weihang Mao et.al. | 2304.03174v1 | null |
2023-04-06 | LSketch: A Label-Enabled Graph Stream Sketch Toward Time-Sensitive Queries | Yiling Zeng et.al. | 2304.02897v1 | null |
2023-04-05 | Tracing and Visualizing Human-ML/AI Collaborative Processes through Artifacts of Data Work | Jennifer Rogers et.al. | 2304.02699v1 | null |
2023-04-05 | Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks | Zejiang Shen et.al. | 2304.02623v1 | null |
2023-04-05 | Optimal Sketching Bounds for Sparse Linear Regression | Tung Mai et.al. | 2304.02261v1 | null |
2023-04-05 | LogoNet: a fine-grained network for instance-level logo sketch retrieval | Binbin Feng et.al. | 2304.02214v1 | link |
2023-04-04 | Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing | Alberto Baldrati et.al. | 2304.02051v1 | link |
2023-04-02 | Sketch-based Video Object Localization | Sangmin Woo et.al. | 2304.00450v1 | link |
2023-03-31 | Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression | Alexander Munteanu et.al. | 2304.00051v1 | link |
2023-03-30 | If At First You Don’t Succeed: Test Time Re-ranking for Zero-shot, Cross-domain Retrieval | Finlay G. C. Hudson et.al. | 2303.17703v1 | null |
2023-03-30 | Methods and advancement of content-based fashion image retrieval: A Review | Amin Muhammad Shoib et.al. | 2303.17371v1 | null |
2023-03-29 | Sketch-an-Anchor: Sub-epoch Fast Model Adaptation for Zero-shot Sketch-based Image Retrieval | Leo Sampaio Ferraz Ribeiro et.al. | 2303.16769v1 | null |
2023-03-28 | Visual Chain-of-Thought Diffusion Models | William Harvey et.al. | 2303.16187v1 | link |
2023-03-27 | What Can Human Sketches Do for Object Detection? | Pinaki Nath Chowdhury et.al. | 2303.15149v1 | null |
2023-03-25 | Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style | Fengyin Lin et.al. | 2303.14348v1 | link |
2023-03-24 | Feature Space Sketching for Logistic Regression | Gregory Dexter et.al. | 2303.14284v1 | null |
2023-03-24 | Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR | Aneeshan Sain et.al. | 2303.13779v1 | null |
2023-03-24 | The First Computer Program | Raúl Rojas et.al. | 2303.13740v1 | null |
2023-03-28 | CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not | Aneeshan Sain et.al. | 2303.13440v3 | null |
2023-03-23 | Defining Quality Requirements for a Trustworthy AI Wildflower Monitoring Platform | Petra Heck et.al. | 2303.13151v1 | null |
2023-03-22 | Evaluation of Sketch-Based and Semantic-Based Modalities for Mockup Generation | Tommaso Calò et.al. | 2303.12709v1 | null |
2023-03-22 | An Extended Study of Human-like Behavior under Adversarial Training | Paul Gavrikov et.al. | 2303.12669v1 | link |
2023-03-24 | RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset | Zhongjin Luo et.al. | 2303.12564v2 | null |
2023-03-21 | Roots and Requirements for Collaborative AI | Mark Stefik et.al. | 2303.12040v1 | null |
2023-03-23 | Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings | Ayan Kumar Bhunia et.al. | 2303.11502v2 | null |
2023-03-20 | Automatic Measures for Evaluating Generative Design Methods for Architects | Eric Yeh et.al. | 2303.11483v1 | null |
2023-03-20 | Picture that Sketch: Photorealistic Image Generation from Abstract Sketches | Subhadeep Koley et.al. | 2303.11162v1 | null |
2023-03-20 | On the Maximal Independent Sets of $k$-mers with the Edit Distance | Leran Ma et.al. | 2303.10926v1 | link |
2023-03-19 | SKED: Sketch-guided Text-based 3D Editing | Aryan Mikaeili et.al. | 2303.10735v1 | null |
2023-03-19 | Trainable Projected Gradient Method for Robust Fine-tuning | Junjiao Tian et.al. | 2303.10720v1 | link |
2023-03-19 | EduVis: Workshop on Visualization Education, Literacy, and Activities | Mandy Keck et.al. | 2303.10708v1 | null |
2023-03-19 | SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations | Pu Li et.al. | 2303.10613v1 | link |
2023-03-17 | PersonalTailor: Personalizing 2D Pattern Design from 3D Garment Point Clouds | Anran Qi et.al. | 2303.09695v1 | null |
2023-03-15 | Query-guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch | Aditay Tripathi et.al. | 2303.08784v1 | null |
2023-03-15 | RIS-Enabled Smart Wireless Environments: Deployment Scenarios, Network Architecture, Bandwidth and Area of Influence | George C. Alexandropoulos et.al. | 2303.08505v1 | null |
2023-03-14 | Data-Free Sketch-Based Image Retrieval | Abhra Chaudhuri et.al. | 2303.07775v1 | link |
2023-03-13 | Can Workers Meaningfully Consent to Workplace Wellbeing Technologies? | Shreya Chowdhary et.al. | 2303.07242v1 | null |
2023-03-13 | An Improved Sample Complexity for Rank-1 Matrix Sensing | Yichuan Deng et.al. | 2303.06895v1 | null |
2023-03-10 | StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces | Shuai Yang et.al. | 2303.06146v1 | link |
2023-03-08 | Sketching with Spherical Designs for Noisy Data Fitting on Spheres | Shao-Bo Lin et.al. | 2303.04550v1 | null |
2023-03-08 | Models of symbol emergence in communication: a conceptual review and a guide for avoiding local minima | Julian Zubek et.al. | 2303.04544v1 | null |
2023-03-07 | Introspective Cross-Attention Probing for Lightweight Transfer of Pre-trained Models | Yonatan Dukler et.al. | 2303.04105v1 | null |
2023-03-06 | Data Portraits: Recording Foundation Model Training Data | Marc Marone et.al. | 2303.03919v1 | null |
2023-03-07 | Sketch-based Medical Image Retrieval | Kazuma Kobayashi et.al. | 2303.03633v1 | null |
2023-03-06 | Model Sketching: Centering Concepts in Early-Stage Machine Learning Model Design | Michelle S. Lam et.al. | 2303.02884v1 | link |
2023-03-05 | Text2Face: A Multi-Modal 3D Face Model | Will Rowan et.al. | 2303.02688v1 | null |
2023-03-03 | Graph-based Extreme Feature Selection for Multi-class Classification Tasks | Shir Friedman et.al. | 2303.01792v1 | null |
2023-03-02 | Coresets for Clustering in Geometric Intersection Graphs | Sayan Bandyapadhyay et.al. | 2303.01400v1 | null |
2023-03-01 | Sketch2Cloth: Sketch-based 3D Garment Generation with Unsigned Distance Fields | Yi He et.al. | 2303.00167v1 | null |
2023-02-26 | Towards Human-Bot Collaborative Software Architecting with ChatGPT | Aakash Ahmad et.al. | 2302.14600v1 | link |
2023-02-28 | On-the-Fly Communication-and-Computing for Distributed Tensor Decomposition Over MIMO Channels | Xu Chen et.al. | 2302.14297v1 | null |
2023-02-27 | Capstone: A Capability-based Foundation for Trustless Secure Memory Access (Extended Version) | Jason Zhijingcheng Yu et.al. | 2302.13863v1 | null |
2023-02-27 | Evaluation of Automatically Constructed Word Meaning Explanations | Marie Stará et.al. | 2302.13625v1 | null |
2023-02-26 | Scalable Weight Reparametrization for Efficient Transfer Learning | Byeonggeun Kim et.al. | 2302.13435v1 | null |
2023-02-24 | Modulating Pretrained Diffusion Models for Multimodal Image Synthesis | Cusuh Ham et.al. | 2302.12764v1 | null |
2023-02-23 | Using Colors and Sketches to Count Subgraphs in a Streaming Graph | Shirin Handjani et.al. | 2302.12210v1 | null |
2023-02-24 | A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries | Prabhakar Kudva et.al. | 2302.12178v2 | null |
2023-02-22 | A Reference Architecture for Observability and Compliance of Cloud Native Applications | William Pourmajidi et.al. | 2302.11617v1 | null |
2023-02-20 | Ontology-aware Network for Zero-shot Sketch-based Image Retrieval | Haoxiang Zhang et.al. | 2302.10040v1 | null |
2023-02-22 | Composer: Creative and Controllable Image Synthesis with Composable Conditions | Lianghua Huang et.al. | 2302.09778v2 | link |
2023-02-16 | Rejecting Cognitivism: Computational Phenomenology for Deep Learning | Pierre Beckmann et.al. | 2302.09071v1 | null |
2023-02-14 | DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided Latent Diffusion Model | Yichen Peng et.al. | 2302.06908v1 | link |
2023-02-14 | Text-Guided Scene Sketch-to-Photo Synthesis | AprilPyone MaungMaung et.al. | 2302.06883v1 | null |
2023-02-14 | Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation | Yasheng Sun et.al. | 2302.06857v1 | null |
2023-02-13 | SkCoder: A Sketch-based Approach for Automatic Code Generation | Jia Li et.al. | 2302.06144v1 | link |
2023-02-13 | Learning to Scale Temperature in Masked Self-Attention for Image Inpainting | Xiang Zhou et.al. | 2302.06130v1 | null |
2023-02-11 | An Evaluation Algorithm for Datalog with Equality | Martin E. Bidlingmaier et.al. | 2302.05792v1 | link |
2023-02-11 | Sketch Less Face Image Retrieval: A New Challenge | Dawei Dai et.al. | 2302.05576v1 | link |
2023-02-10 | MaskSketch: Unpaired Structure-guided Masked Image Generation | Dina Bashkirova et.al. | 2302.05496v1 | link |
2023-02-10 | Count-min sketch with variable number of hash functions: an experimental study | Éric Fusy et.al. | 2302.05245v1 | null |
2023-02-10 | Fast Gumbel-Max Sketch and its Applications | Yuanming Zhang et.al. | 2302.05176v1 | null |
2023-02-09 | Projection-free Online Exp-concave Optimization | Dan Garber et.al. | 2302.04859v1 | null |
2023-02-09 | Locally consistent decomposition of strings with applications to edit distance sketching | Sudatta Bhattacharya et.al. | 2302.04475v1 | null |
2023-02-06 | Sketching Robot Programs On the Fly | David Porfirio et.al. | 2302.03088v1 | null |
2023-02-05 | Leaving Reality to Imagination: Robust Classification via Generated Datasets | Hritik Bansal et.al. | 2302.02503v1 | link |
2023-02-04 | An Effective and Differentially Private Protocol for Secure Distributed Cardinality Estimation | Pinghui Wang et.al. | 2302.02158v1 | null |
2023-02-04 | Sketch-Flip-Merge: Mergeable Sketches for Private Distinct Counting | Jonathan Hehir et.al. | 2302.02056v1 | null |
2023-02-01 | A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee | Zhao Song et.al. | 2302.00248v1 | null |
2023-01-31 | FLAME: A small language model for spreadsheet formulas | Harshit Joshi et.al. | 2301.13779v1 | null |
2023-01-30 | Streaming Anomaly Detection | Siddharth Bhatia et.al. | 2301.13199v1 | link |
2023-01-29 | BERT-based Authorship Attribution on the Romanian Dataset called ROST | Sanda-Maria Avram et.al. | 2301.12500v1 | null |
2023-01-26 | Synesthetic Dice: Sensors, Actuators, And Mappings | Albrecht Kurze et.al. | 2301.11436v1 | null |
2023-01-26 | Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Xudong Wang et.al. | 2301.11320v1 | link |
2023-01-25 | Reflective Artificial Intelligence | Peter R. Lewis et.al. | 2301.10823v1 | null |
2023-01-25 | Distilling Text into Circuits | Vincent Wang-Mascianica et.al. | 2301.10595v1 | null |
2023-01-24 | Capacity Analysis of Vector Symbolic Architectures | Kenneth L. Clarkson et.al. | 2301.10352v1 | null |
2023-01-20 | Improving Sketch Colorization using Adversarial Segmentation Consistency | Samet Hicsonmez et.al. | 2301.08590v1 | link |
2023-01-19 | On Finite Blocklength Lossy Source Coding | Lin Zhou et.al. | 2301.07871v1 | null |
2023-01-17 | Vision Based Machine Learning Algorithms for Out-of-Distribution Generalisation | Hamza Riaz et.al. | 2301.06975v1 | null |
2023-01-17 | Distribution Aligned Feature Clustering for Zero-Shot Sketch-Based Image Retrieval | Yuchen Wu et.al. | 2301.06685v1 | null |
2023-01-16 | A Distributed Palette Sparsification Theorem | Maxime Flin et.al. | 2301.06457v1 | null |
2023-01-14 | Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation | Aline Bessa et.al. | 2301.05811v1 | null |
2023-01-06 | Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries Sketch | Christian Janos Lebeda et.al. | 2301.02457v1 | null |
2023-01-03 | EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions [Technical Report] | Enhao Zhang et.al. | 2301.00929v1 | link |
2023-01-17 | Algorithms for Massive Data – Lecture Notes | Nicola Prezza et.al. | 2301.00754v2 | null |
2022-12-28 | Modular termination verification with a higher-order concurrent separation logic (Intermediate report) | Justus Fasse et.al. | 2212.14126v1 | null |
2022-12-22 | A Domain-Extensible Compiler with Controllable Automation of Optimisations | Thomas Koehler et.al. | 2212.12035v1 | null |
(<a href=#Updated-on-20240404>back to top</a>)
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-04-03 | Neural Radiance Fields with Torch Units | Bingnan Ni et.al. | 2404.02617v1 | null |
2024-04-03 | TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Surrounding Autonomous Driving Scenes | Cheng Zhao et.al. | 2404.02410v1 | null |
2024-04-03 | APC2Mesh: Bridging the gap from occluded building façades to full 3D models | Perpetual Hope Akwensi et.al. | 2404.02391v1 | null |
2024-04-01 | Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects | Yijia Weng et.al. | 2404.01440v1 | link |
2024-04-01 | NVINS: Robust Visual Inertial Navigation Fused with NeRF-augmented Camera Pose Regressor and Uncertainty Quantification | Juyeop Han et.al. | 2404.01400v1 | null |
2024-04-01 | FPGA-Accelerated Correspondence-free Point Cloud Registration with PointNet Features | Keisuke Sugiura et.al. | 2404.01237v1 | null |
2024-04-02 | Few-shot point cloud reconstruction and denoising via learned Guassian splats renderings and fine-tuned diffusion features | Pietro Bonazzi et.al. | 2404.01112v2 | null |
2024-03-30 | DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans | Akash Sengupta et.al. | 2404.00485v1 | null |
2024-03-30 | 3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting | Xiaoyang Lyu et.al. | 2404.00409v1 | null |
2024-03-29 | Sparse Views, Near Light: A Practical Paradigm for Uncalibrated Point-light Photometric Stereo | Mohammed Brahimi et.al. | 2404.00098v1 | null |
2024-03-29 | NeSLAM: Neural Implicit Mapping and Self-Supervised Feature Tracking With Depth Completion and Denoising | Tianchen Deng et.al. | 2403.20034v1 | link |
2024-03-28 | CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians | Avinash Paliwal et.al. | 2403.19495v1 | null |
2024-03-30 | Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction | Xiaoyang Lyu et.al. | 2403.19314v2 | link |
2024-03-28 | Neural Fields for 3D Tracking of Anatomy and Surgical Instruments in Monocular Laparoscopic Video Clips | Beerend G. A. Gerats et.al. | 2403.19265v1 | null |
2024-04-01 | WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects under Occlusion | Khiem Vuong et.al. | 2403.19022v2 | null |
2024-03-29 | Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction | Qiuhong Shen et.al. | 2403.18795v2 | null |
2024-03-29 | SplatFace: Gaussian Splat Face Reconstruction Leveraging an Optimizable Surface | Jiahao Luo et.al. | 2403.18784v2 | null |
2024-03-27 | Breaking the Limitations with Sparse Inputs by Variational Frameworks (BLIss) in Terahertz Super-Resolution 3D Reconstruction | Yiyao Zhang et.al. | 2403.18776v1 | link |
2024-03-27 | SAT-NGP: Unleashing Neural Graphics Primitives for Fast Relightable Transient-Free 3D reconstruction from Satellite Imagery | Camille Billouard et.al. | 2403.18711v1 | null |
2024-03-26 | EgoLifter: Open-world 3D Segmentation for Egocentric Perception | Qiao Gu et.al. | 2403.18118v1 | null |
2024-03-25 | Creating a Digital Twin of Spinal Surgery: A Proof of Concept | Jonas Hein et.al. | 2403.16736v1 | null |
2024-03-25 | Spike-NeRF: Neural Radiance Field Based On Spike Camera | Yijia Guo et.al. | 2403.16410v1 | null |
2024-03-25 | Elite360D: Towards Efficient 360 Depth Estimation via Semantic- and Distance-Aware Bi-Projection Fusion | Hao Ai et.al. | 2403.16376v1 | null |
2024-03-24 | latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction | Christopher Wewer et.al. | 2403.16292v1 | null |
2024-03-23 | UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation | Yuliang Guo et.al. | 2403.15705v1 | null |
2024-03-22 | FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos | Florian Langer et.al. | 2403.15161v1 | null |
2024-03-22 | Recent Trends in 3D Reconstruction of General Non-Rigid Scenes | Raza Yunus et.al. | 2403.15064v1 | null |
2024-03-21 | Hyperspectral Neural Radiance Fields | Gerry Chen et.al. | 2403.14839v1 | null |
2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Yinghao Xu et.al. | 2403.14621v1 | link |
2024-03-21 | Isotropic Gaussian Splatting for Real-Time Radiance Field Rendering | Yuanhao Gong et.al. | 2403.14244v1 | null |
2024-03-21 | Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions | Jiacong Xu et.al. | 2403.14053v1 | null |
2024-03-20 | T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image | Shijie Zhang et.al. | 2403.13663v1 | null |
2024-03-20 | MULAN-WC: Multi-Robot Localization Uncertainty-aware Active NeRF with Wireless Coordination | Weiying Wang et.al. | 2403.13348v1 | null |
2024-03-19 | GVGEN: Text-to-3D Generation with Volumetric Representation | Xianglong He et.al. | 2403.12957v1 | null |
2024-03-19 | PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery | Wendi Yang et.al. | 2403.12473v1 | null |
2024-03-18 | LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation | Yushi Lan et.al. | 2403.12019v1 | null |
2024-03-18 | GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image | Xiao Fu et.al. | 2403.12013v1 | null |
2024-03-18 | SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion | Vikram Voleti et.al. | 2403.12008v1 | null |
2024-03-18 | GNeRP: Gaussian-guided Neural Reconstruction of Reflective Objects with Noisy Polarization Priors | LI Yang et.al. | 2403.11899v1 | null |
2024-03-18 | OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation | Haochen Jiang et.al. | 2403.11796v1 | null |
2024-03-18 | Fed3DGS: Scalable 3D Gaussian Splatting with Federated Learning | Teppei Suzuki et.al. | 2403.11460v1 | link |
2024-03-18 | BAGS: Building Animatable Gaussian Splatting from a Monocular Video with Diffusion Priors | Tingyang Zhang et.al. | 2403.11427v1 | null |
2024-03-17 | Creating Seamless 3D Maps Using Radiance Fields | Sai Tarun Sathyan et.al. | 2403.11364v1 | null |
2024-03-17 | Recent Advances in 3D Gaussian Splatting | Tong Wu et.al. | 2403.11134v1 | null |
2024-03-17 | Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications | Yonggan Fu et.al. | 2403.11131v1 | null |
2024-03-16 | Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription | Hongxiang Zhao et.al. | 2403.10953v1 | null |
2024-03-15 | SCILLA: SurfaCe Implicit Learning for Large Urban Area, a volumetric hybrid solution | Hala Djeghim et.al. | 2403.10344v1 | null |
2024-03-15 | FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model | Qijun Feng et.al. | 2403.10242v1 | null |
2024-03-15 | Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience | Xiaohang Yu et.al. | 2403.09973v1 | null |
2024-03-14 | MARVIS: Motion & Geometry Aware Real and Virtual Image Segmentation | Jiayi Wu et.al. | 2403.09850v1 | link |
2024-03-14 | Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting | Jaewoo Jung et.al. | 2403.09413v1 | link |
2024-03-13 | 3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface | Linyi Jin et.al. | 2403.08768v1 | null |
2024-03-13 | Refractive COLMAP: Refractive Structure-from-Motion Revisited | Mengkun She et.al. | 2403.08640v1 | null |
2024-03-12 | Q-SLAM: Quadric Representations for Monocular SLAM | Chensheng Peng et.al. | 2403.08125v1 | null |
2024-03-11 | Bayesian Diffusion Models for 3D Shape Reconstruction | Haiyang Xu et.al. | 2403.06973v1 | null |
2024-03-08 | DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction | Jaehyeok Shim et.al. | 2403.05005v1 | null |
2024-03-11 | Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed | Yifan Wang et.al. | 2403.04765v2 | null |
2024-03-08 | Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces | Evangelos Skartados et.al. | 2403.04508v2 | null |
2024-03-07 | CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images | Guanlin Shen et.al. | 2403.04198v1 | link |
2024-03-05 | Pooling Image Datasets With Multiple Covariate Shift and Imbalance | Sotirios Panagiotis Chytas et.al. | 2403.02598v1 | null |
2024-03-04 | TripoSR: Fast 3D Object Reconstruction from a Single Image | Dmitry Tochilkin et.al. | 2403.02151v1 | link |
2024-03-03 | A Novel Dynamic Light-Section 3D Reconstruction Method for Wide-Range Sensing | Mengjuan Chen et.al. | 2403.01374v1 | null |
2024-03-08 | G3DR: Generative 3D Reconstruction in ImageNet | Pradyumna Reddy et.al. | 2403.00939v2 | link |
2024-03-01 | DISORF: A Distributed Online NeRF Training and Rendering Framework for Mobile Robots | Chunlin Li et.al. | 2403.00228v1 | null |
2024-03-05 | VEnvision3D: A Synthetic Perception Dataset for 3D Multi-Task Model Research | Jiahao Zhou et.al. | 2402.19059v2 | null |
2024-02-27 | Sora Generates Videos with Stunning Geometrical Consistency | Xuanyi Li et.al. | 2402.17403v1 | null |
2024-02-27 | CharNeRF: 3D Character Generation from Concept Art | Eddy Chu et.al. | 2402.17115v1 | null |
2024-02-26 | DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer | Yizhe Wu et.al. | 2402.16308v1 | null |
2024-02-25 | GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction | Xiao Chen et.al. | 2402.16174v1 | null |
2024-02-24 | A Generative Machine Learning Model for Material Microstructure 3D Reconstruction and Performance Evaluation | Yilin Zheng et.al. | 2402.15815v1 | null |
2024-02-22 | Cameras as Rays: Pose Estimation via Ray Diffusion | Jason Y. Zhang et.al. | 2402.14817v1 | null |
2024-02-22 | Workspace Analysis for Laparoscopic Rectal Surgery: A Preliminary Study | Alexandra Thomieres et.al. | 2402.14386v1 | null |
2024-02-22 | MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion | Xin-Yang Zheng et.al. | 2402.14253v1 | null |
2024-02-20 | MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction | Shitao Tang et.al. | 2402.12712v1 | null |
2024-02-25 | A Robust Error-Resistant View Selection Method for 3D Reconstruction | Shaojie Zhang et.al. | 2402.11431v2 | null |
2024-02-17 | Dense Matchers for Dense Tracking | Tomáš Jelínek et.al. | 2402.11287v1 | null |
2024-02-17 | DiffPoint: Single and Multi-view Point Cloud Reconstruction with ViT Based Diffusion Model | Yu Feng et.al. | 2402.11241v1 | null |
2024-02-15 | Evaluating NeRFs for 3D Plant Geometry Reconstruction in Field Conditions | Muhammad Arbab Arshad et.al. | 2402.10344v1 | null |
2024-02-15 | GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering | Abdullah Hamdi et.al. | 2402.10128v1 | link |
2024-02-14 | PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments | Xiuzhong Hu et.al. | 2402.09325v1 | link |
2024-02-14 | DUDF: Differentiable Unsigned Distance Fields with Hyperbolic Scaling | Miguel Fainstein et.al. | 2402.08876v1 | null |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682v1 | null |
2024-02-20 | Camera Calibration through Geometric Constraints from Rotation and Projection Matrices | Muhammad Waleed et.al. | 2402.08437v2 | link |
2024-02-09 | Neural Rendering based Urban Scene Reconstruction for Autonomous Driving | Shihao Shen et.al. | 2402.06826v1 | null |
2024-02-07 | Carousel phase retrieval algorithm for 3D coherent X-ray diffraction imaging | Fangzhou Ai et.al. | 2402.05283v1 | link |
2024-02-06 | EscherNet: A Generative Model for Scalable View Synthesis | Xin Kong et.al. | 2402.03908v1 | link |
2024-02-09 | MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction | Heng Zhou et.al. | 2402.03762v2 | null |
2024-02-05 | Denoising Diffusion via Image-Based Rendering | Titas Anciukevicius et.al. | 2402.03445v1 | null |
2024-02-02 | Di-NeRF: Distributed NeRF for Collaborative Learning with Unknown Relative Poses | Mahboubeh Asadi et.al. | 2402.01485v1 | null |
2024-02-02 | DeepAAT: Deep Automated Aerial Triangulation for Fast UAV-based Mapping | Zequan Chen et.al. | 2402.01134v1 | link |
2024-02-01 | Enhanced fringe-to-phase framework using deep learning | Won-Hoe Kim et.al. | 2402.00977v1 | null |
2024-02-01 | Diffusion-based Light Field Synthesis | Ruisheng Gao et.al. | 2402.00575v1 | null |
2024-01-31 | Local Feature Matching Using Deep Learning: A Survey | Shibiao Xu et.al. | 2401.17592v1 | link |
2024-01-30 | Self-Supervised Representation Learning for Nerve Fiber Distribution Patterns in 3D-PLI | Alexander Oberstrass et.al. | 2401.17207v1 | null |
2024-01-30 | Physical Priors Augmented Event-Based 3D Reconstruction | Jiaxu Wang et.al. | 2401.17121v1 | link |
2024-01-30 | OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision | Bruno Berenguel-Baeta et.al. | 2401.17061v1 | link |
2024-01-29 | Domain adaptation strategies for 3D reconstruction of the lumbar spine using real fluoroscopy data | Sascha Jecklin et.al. | 2401.16027v1 | null |
2024-01-29 | 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D | Yizheng Chen et.al. | 2401.15841v1 | null |
2024-01-28 | Multi-Person 3D Pose Estimation from Multi-View Uncalibrated Depth Cameras | Yu-Jhe Li et.al. | 2401.15616v1 | null |
2024-01-26 | 3D Reconstruction and New View Synthesis of Indoor Environments based on a Dual Neural Radiance Field | Zhenyu Bao et.al. | 2401.14726v1 | link |
2024-01-25 | TIFu: Tri-directional Implicit Function for High-Fidelity 3D Character Reconstruction | Byoungsung Lim et.al. | 2401.14565v1 | null |
2024-01-25 | Range-Agnostic Multi-View Depth Estimation With Keyframe Selection | Andrea Conti et.al. | 2401.14401v1 | link |
2024-01-25 | pix2gestalt: Amodal Segmentation by Synthesizing Wholes | Ege Ozguroglu et.al. | 2401.14398v1 | link |
2024-01-25 | GauU-Scene: A Scene Reconstruction Benchmark on Large Scale 3D Reconstruction Dataset Using Gaussian Splatting | Butian Xiong et.al. | 2401.14032v1 | null |
2024-01-24 | EndoGaussians: Single View Dynamic Gaussian Splatting for Deformable Endoscopic Tissues Reconstruction | Yangsen Chen et.al. | 2401.13352v1 | null |
2024-01-23 | IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images | Zhi-Hao Lin et.al. | 2401.12977v1 | null |
2024-01-21 | A Survey on African Computer Vision Datasets, Topics and Researchers | Abdul-Hakeem Omotayo et.al. | 2401.11617v1 | null |
2024-01-21 | Multi-View Neural 3D Reconstruction of Micro-/Nanostructures with Atomic Force Microscopy | Shuo Chen et.al. | 2401.11541v1 | link |
2024-01-21 | Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting | Lingting Zhu et.al. | 2401.11535v1 | link |
2024-01-17 | POE: Acoustic Soft Robotic Proprioception for Omnidirectional End-effectors | Uksang Yoo et.al. | 2401.09382v1 | null |
2024-01-16 | Learning Implicit Representation for Reconstructing Articulated Objects | Hao Zhang et.al. | 2401.08809v1 | null |
2024-01-20 | Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis | Zhenhui Ye et.al. | 2401.08503v2 | null |
2024-01-16 | S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera | Thanh Nguyen Canh et.al. | 2401.08134v1 | null |
2024-01-12 | 3D Reconstruction of Interacting Multi-Person in Clothing from a Single Image | Junuk Cha et.al. | 2401.06415v1 | null |
2024-01-12 | SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM optimization | Zhenlong Yuan et.al. | 2401.06385v1 | null |
2024-01-12 | Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery | Beilei Cui et.al. | 2401.06013v2 | link |
2024-01-10 | Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects | Tianhang Cheng et.al. | 2401.05236v1 | link |
2024-01-07 | RHOBIN Challenge: Reconstruction of Human Object Interaction | Xianghui Xie et.al. | 2401.04143v1 | null |
2024-01-08 | AGG: Amortized Generative 3D Gaussians for Single Image to 3D | Dejia Xu et.al. | 2401.04099v1 | null |
2024-01-08 | A Survey on 3D Gaussian Splatting | Guikun Chen et.al. | 2401.03890v1 | null |
2024-01-03 | S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery | Qingyuan Yang et.al. | 2401.01643v1 | link |
2023-12-29 | Informative Rays Selection for Few-Shot Neural Radiance Fields | Marco Orsingher et.al. | 2312.17561v1 | null |
2023-12-28 | Toward Semantic Scene Understanding for Fine-Grained 3D Modeling of Plants | Mohamad Qadri et.al. | 2312.17110v1 | null |
2023-12-28 | Learning Spatially Collaged Fourier Bases for Implicit Neural Representation | Jason Chun Lok Li et.al. | 2312.17018v1 | null |
2023-12-27 | In-Hand 3D Object Reconstruction from a Monocular RGB Video | Shijian Jiang et.al. | 2312.16425v1 | null |
2023-12-24 | SUNDIAL: 3D Satellite Understanding through Direct, Ambient, and Complex Lighting Decomposition | Nikhil Behari et.al. | 2312.16215v1 | null |
2023-12-24 | A theory of volumetric representations for opaque solids | Bailey Miller et.al. | 2312.15406v1 | null |
2023-12-22 | Pola4All: survey of polarimetric applications and an open-source toolkit to analyze polarization | Joaquin Rodriguez et.al. | 2312.14697v1 | link |
2023-12-22 | Scalable 3D Reconstruction From Single Particle X-Ray Diffraction Images Based on Online Machine Learning | Jay Shenoy et.al. | 2312.14432v1 | null |
2023-12-21 | PlatoNeRF: 3D Reconstruction in Plato’s Cave via Single-View Two-Bounce Lidar | Tzofi Klinghoffer et.al. | 2312.14239v1 | null |
2023-12-21 | 3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera | Christen Millerdurai et.al. | 2312.14157v1 | null |
2023-12-21 | DUSt3R: Geometric 3D Vision Made Easy | Shuzhe Wang et.al. | 2312.14132v1 | link |
2023-12-21 | Anatomical basis of sex differences in human post-myocardial infarction ECG phenotypes identified by novel automated torso-cardiac 3D reconstruction | Hannah J. Smith et.al. | 2312.13976v1 | null |
2023-12-21 | SyncDreamer for 3D Reconstruction of Endangered Animal Species with NeRF and NeuS | Ahmet Haydar Ornek et.al. | 2312.13832v1 | null |
2023-12-21 | Visual Tomography: Physically Faithful Volumetric Models of Partially Translucent Objects | David Nakath et.al. | 2312.13494v1 | null |
2023-12-20 | UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections | Fangjinhua Wang et.al. | 2312.13285v1 | null |
2023-12-20 | Splatter Image: Ultra-Fast Single-View 3D Reconstruction | Stanislaw Szymanowicz et.al. | 2312.13150v1 | link |
2023-12-21 | pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction | David Charatan et.al. | 2312.12337v2 | link |
2023-12-19 | EVI-SAM: Robust, Real-time, Tightly-coupled Event-Visual-Inertial State Estimation and 3D Dense Mapping | Weipeng Guan et.al. | 2312.11911v1 | link |
2023-12-17 | Primitive-based 3D Human-Object Interaction Modelling and Programming | Siqi Liu et.al. | 2312.10714v1 | null |
2023-12-16 | Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers | Zi-Xin Zou et.al. | 2312.09147v2 | null |
2023-12-14 | Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments | Liyuan Zhu et.al. | 2312.09138v1 | null |
2023-12-14 | Scene 3-D Reconstruction System in Scattering Medium | Zhuoyifan Zhang et.al. | 2312.09005v1 | null |
2023-12-11 | Gaussian Splatting SLAM | Hidenobu Matsuki et.al. | 2312.06741v1 | null |
2023-12-10 | UNeR3D: Versatile and Scalable 3D RGB Point Cloud Generation from 2D Images in Unsupervised Reconstruction | Hongbin Lin et.al. | 2312.06706v1 | null |
2023-12-10 | SuperPrimitive: Scene Reconstruction at a Primitive Level | Kirill Mazur et.al. | 2312.05889v1 | null |
2023-12-11 | Nuvo: Neural UV Mapping for Unruly 3D Representations | Pratul P. Srinivasan et.al. | 2312.05283v1 | null |
2023-12-08 | Fine Dense Alignment of Image Bursts through Camera Pose and Depth Estimation | Bruno Lecouat et.al. | 2312.05190v1 | null |
2023-12-08 | SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration | Xu Cao et.al. | 2312.04803v1 | null |
2023-12-07 | FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models | Stathis Galanakis et.al. | 2312.04465v1 | null |
2023-12-06 | Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle | Youtian Lin et.al. | 2312.03431v1 | null |
2023-12-06 | Evaluating the point cloud of individual trees generated from images based on Neural Radiance fields (NeRF) method | Hongyu Huang et.al. | 2312.03372v1 | null |
2023-12-06 | RING-NeRF: A Versatile Architecture based on Residual Implicit Neural Grids | Doriand Petit et.al. | 2312.03357v1 | null |
2023-12-05 | ReconFusion: 3D Reconstruction with Diffusion Priors | Rundi Wu et.al. | 2312.02981v1 | null |
2023-12-05 | R3D-SWIN: Use Shifted Window Attention for Single-View 3D Reconstruction | Chenhuan Li et.al. | 2312.02725v1 | null |
2023-12-05 | DreaMo: Articulated 3D Reconstruction From A Single Casual Video | Tao Tu et.al. | 2312.02617v1 | null |
2023-12-05 | Prompt2NeRF-PIL: Fast NeRF Generation via Pretrained Implicit Latent | Jianmeng Liu et.al. | 2312.02568v1 | null |
2023-12-03 | Slice3D: Multi-Slice, Occlusion-Revealing, Single View 3D Reconstruction | Yizhi Wang et.al. | 2312.02221v1 | null |
2023-12-04 | Steerers: A framework for rotation equivariant keypoint descriptors | Georg Bökman et.al. | 2312.02152v1 | link |
2023-12-04 | iMatching: Imperative Correspondence Learning | Zitong Zhan et.al. | 2312.02141v1 | null |
2023-12-04 | Light Field Imaging in the Restrictive Object Space based on Flexible Angular Plane | Ping Zhou et.al. | 2312.01761v1 | null |
2023-12-02 | RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction | Baptiste Brument et.al. | 2312.01215v1 | link |
2023-12-05 | Self-Evolving Neural Radiance Fields | Jaewoo Jung et.al. | 2312.01003v2 | link |
2023-12-01 | NeuSG: Neural Implicit Surface Reconstruction with 3D Gaussian Splatting Guidance | Hanlin Chen et.al. | 2312.00846v1 | null |
2023-12-01 | UAVs and Birds: Enhancing Short-Range Navigation through Budgerigar Flight Studies | Md. Mahmudur Rahman et.al. | 2312.00597v1 | null |
2023-11-30 | Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data | Yu Deng et.al. | 2311.18729v1 | null |
2023-11-30 | Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy | Pedro Esteban Chavarrias Solano et.al. | 2311.18664v1 | null |
2023-11-30 | HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video | Zicong Fan et.al. | 2311.18448v1 | link |
2023-11-29 | Volumetric Cloud Field Reconstruction | Jacob Lin et.al. | 2311.17657v1 | null |
2023-11-30 | REF$^2$-NeRF: Reflection and Refraction aware Neural Radiance Field | Wooseok Kim et.al. | 2311.17116v2 | link |
2023-11-28 | Multi-Scale 3D Gaussian Splatting for Anti-Aliased Rendering | Zhiwen Yan et.al. | 2311.17089v1 | null |
2023-11-28 | Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models | Zhengming Yu et.al. | 2311.17050v1 | null |
2023-11-28 | Gradient-based Local Next-best-view Planning for Improved Perception of Targeted Plant Nodes | Akshay K. Burusa et.al. | 2311.16759v1 | null |
2023-11-28 | RGBGrasp: Image-based Object Grasping by Capturing Multiple Views during Robot Arm Movement with Neural Radiance Fields | Chang Liu et.al. | 2311.16592v1 | null |
2023-11-28 | Rethinking Directional Integration in Neural Radiance Fields | Congyue Deng et.al. | 2311.16504v1 | null |
2023-11-27 | Weakly-Supervised 3D Reconstruction of Clothed Humans via Normal Maps | Jane Wu et.al. | 2311.16042v1 | null |
2023-11-27 | SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion | Hsuan-I Ho et.al. | 2311.15855v1 | null |
2023-11-26 | Obj-NeRF: Extract Object NeRFs from Multi-view Images | Zhiyi Li et.al. | 2311.15291v1 | null |
2023-11-25 | Multi-task Planar Reconstruction with Feature Warping Guidance | Luan Wei et.al. | 2311.14981v1 | link |
2023-11-24 | RSB-Pose: Robust Short-Baseline Binocular 3D Human Pose Estimation with Occlusion Handling | Xiaoyue Wan et.al. | 2311.14242v1 | null |
2023-11-23 | GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence | Van Nguyen Nguyen et.al. | 2311.14155v1 | link |
2023-11-23 | MonoNav: MAV Navigation via Monocular Depth Estimation and Reconstruction | Nathaniel Simon et.al. | 2311.14100v1 | null |
2023-11-23 | DRIFu: Differentiable Rendering and Implicit Function-based Single-View 3D Reconstruction | Zijian Kuang et.al. | 2311.13199v2 | link |
2023-11-22 | Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing | Xingyu Chen et.al. | 2311.13182v1 | null |
2023-11-21 | Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models | David Stotko et.al. | 2311.12796v1 | link |
2023-11-20 | Mixing-Denoising Generalizable Occupancy Networks | Amine Ouasfi et.al. | 2311.12125v1 | null |
2023-11-23 | PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction | Peng Wang et.al. | 2311.12024v2 | null |
2023-11-19 | GaussianDiffusion: 3D Gaussian Splatting for Denoising Diffusion Probabilistic Models with Structured Noise | Xinhai Li et.al. | 2311.11221v1 | null |
2023-11-18 | LOSTU: Fast, Scalable, and Uncertainty-Aware Triangulation | Sébastien Henry et.al. | 2311.11171v1 | null |
2023-11-18 | Invariant-based Mapping of Space During General Motion of an Observer | Juan D. Yepes et.al. | 2311.11130v1 | null |
2023-11-16 | DSR-Diff: Depth Map Super-Resolution with Diffusion Model | Yuan Shi et.al. | 2311.09919v1 | null |
2023-11-18 | EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices | Jingnan Gao et.al. | 2311.09806v2 | null |
2023-11-14 | DynamicSurf: Dynamic Neural RGB-D Surface Reconstruction with an Optimizable Feature Grid | Mirgahney Mohamed et.al. | 2311.08159v1 | null |
2023-11-13 | $L_0$-Sampler: An $L_{0}$ Model Guided Volume Sampling for NeRF | Liangchen Li et.al. | 2311.07044v1 | null |
2023-11-11 | 3DFusion, A real-time 3D object reconstruction pipeline based on streamed instance segmented data | Xi Sun et.al. | 2311.06659v1 | null |
2023-11-09 | ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image | Senthil Purushwalkam et.al. | 2311.05230v1 | null |
2023-11-08 | Implicit Neural Representations for Breathing-compensated Volume Reconstruction in Robotic Ultrasound Aorta Screening | Yordanka Velikova et.al. | 2311.04999v1 | null |
2023-11-08 | LRM: Large Reconstruction Model for Single Image to 3D | Yicong Hong et.al. | 2311.04400v1 | null |
2023-11-07 | High-fidelity 3D Reconstruction of Plants using Neural Radiance Field | Kewei Hu et.al. | 2311.04154v1 | null |
2023-11-07 | DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding | Kehinde Ajayi et.al. | 2311.04098v1 | link |
2023-11-05 | MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis | Xuqian Ren et.al. | 2311.02778v1 | null |
2023-11-05 | Fast Point-cloud to Mesh Reconstruction for Deformable Object Tracking | Elham Amin Mansour et.al. | 2311.02749v1 | null |
2023-11-05 | IPVNet: Learning Implicit Point-Voxel Features for Open-Surface 3D Reconstruction | Mohammad Samiul Arshad et.al. | 2311.02552v1 | link |
2023-11-02 | CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation | Jingkang Wang et.al. | 2311.01447v1 | null |
2023-11-02 | Look at Robot Base Once: Hand-Eye Calibration with Point Clouds of Robot Base Leveraging Learning-Based 3D Vision | Leihui Li et.al. | 2311.01335v1 | link |
2023-11-02 | Joint 3D Shape and Motion Estimation from Rolling Shutter Light-Field Images | Hermes McGriff et.al. | 2311.01292v1 | link |
2023-11-01 | Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture | Yixin Chen et.al. | 2311.00457v1 | null |
2023-10-31 | Deep Compressed Learning for 3D Seismic Inversion | Maayan Gelboim et.al. | 2311.00107v1 | null |
2023-10-31 | Refined Equivalent Pinhole Model for Large-scale 3D Reconstruction from Spaceborne CCD Imagery | Hong Danyang et.al. | 2310.20117v1 | null |
2023-10-29 | 3DMiner: Discovering Shapes from Large-Scale Unannotated Image Datasets | Ta-Ying Cheng et.al. | 2310.19188v1 | null |
2023-10-25 | Open-NeRF: Towards Open Vocabulary NeRF Decomposition | Hao Zhang et.al. | 2310.16383v1 | null |
2023-10-23 | Novel-View Acoustic Synthesis from 3D Reconstructed Rooms | Byeongjoo Ahn et.al. | 2310.15130v1 | link |
2023-10-23 | Interaction-Driven Active 3D Reconstruction with Object Interiors | Zihao Yan et.al. | 2310.14700v1 | null |
2023-10-23 | VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations | Yiying Yang et.al. | 2310.14487v1 | null |
2023-10-22 | A Quantitative Evaluation of Dense 3D Reconstruction of Sinus Anatomy from Monocular Endoscopic Video | Jan Emily Mangulabnan et.al. | 2310.14364v1 | null |
2023-10-20 | Single-view 3D reconstruction via inverse procedural modeling | Albert Garifullin et.al. | 2310.13373v1 | null |
2023-10-20 | UE4-NeRF: Neural Radiance Field for Real-Time Rendering of Large-Scale Scene | Jiaming Gu et.al. | 2310.13263v1 | null |
2023-10-19 | Real space iterative reconstruction for vector tomography (RESIRE-V) | Minh Pham et.al. | 2310.12513v1 | link |
2023-10-18 | ShapeGraFormer: GraFormer-Based Network for Hand-Object Reconstruction from a Single Depth Map | Ahmed Tawfik Aboukhadra et.al. | 2310.11811v1 | null |
2023-10-17 | Learning Neural Implicit through Volume Rendering with Attentive Depth Fusion Priors | Pengchong Hu et.al. | 2310.11598v1 | null |
2023-10-17 | Field Robot for High-throughput and High-resolution 3D Plant Phenotyping | Felix Esser et.al. | 2310.11516v1 | null |
2023-10-16 | In-Situ Single Particle Reconstruction Reveals 3D Evolution of PtNi Nanocatalysts During Heating | Yi-Chi Wang et.al. | 2310.10253v1 | null |
2023-10-15 | Tabletop Transparent Scene Reconstruction via Epipolar-Guided Optical Flow with Monocular Depth Completion Prior | Xiaotong Chen et.al. | 2310.09956v1 | null |
2023-10-15 | CBARF: Cascaded Bundle-Adjusting Neural Radiance Fields from Imperfect Camera Poses | Hongyu Fu et.al. | 2310.09776v1 | null |
2023-10-12 | Implicit Shape and Appearance Priors for Few-Shot Full Head Reconstruction | Pol Caselles et.al. | 2310.08784v1 | null |
2023-10-13 | PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm | Haoyi Zhu et.al. | 2310.08586v2 | link |
2023-10-12 | Consistent123: Improve Consistency for One Image to 3D Object Synthesis | Haohan Weng et.al. | 2310.08092v1 | null |
2023-10-10 | SketchBodyNet: A Sketch-Driven Multi-faceted Decoder Network for 3D Human Reconstruction | Fei Wang et.al. | 2310.06577v1 | link |
2023-10-08 | Experiences with CAMRE: Single-Device Collaborative Adaptive Mixed Reality Environment | Hung-Jui Guo et.al. | 2310.04996v1 | null |
2023-10-02 | PC-NeRF: Parent-Child Neural Radiance Fields under Partial Sensor Data Loss in Autonomous Driving Environments | Xiuzhong Hu et.al. | 2310.00874v1 | link |
2023-10-01 | Enabling Neural Radiance Fields (NeRF) for Large-scale Aerial Images – A Multi-tiling Approaching and the Geometry Assessment of NeRF | Ningli Xu et.al. | 2310.00530v1 | null |
2023-09-29 | 3D Reconstruction in Noisy Agricultural Environments: A Bayesian Optimization Perspective for View Planning | Athanasios Bacharis et.al. | 2310.00145v1 | null |
2023-09-29 | Effect of structure-based training on 3D localization precision and quality | Armin Abdehkakha et.al. | 2309.17265v1 | null |
2023-09-28 | Sketch2CADScript: 3D Scene Reconstruction from 2D Sketch using Visual Transformer and Rhino Grasshopper | Hong-Bin Yang et.al. | 2309.16850v1 | null |
2023-09-29 | 3D Reconstruction with Generalizable Neural Fields using Scene Priors | Yang Fu et.al. | 2309.15164v2 | null |
2023-09-26 | Combining optical diffraction tomography with imaging flow cytometry for characterizing morphology, hemoglobin content, and membrane deformability of live red blood cells | Yu-Hsiang Chang et.al. | 2309.15131v1 | null |
2023-09-26 | PHRIT: Parametric Hand Representation with Implicit Template | Zhisheng Huang et.al. | 2309.14916v1 | null |
2023-09-26 | Unsupervised Reconstruction of 3D Human Pose Interactions From 2D Poses Alone | Peter Hardy et.al. | 2309.14865v1 | null |
2023-09-26 | 3D Density-Gradient based Edge Detection on Neural Radiance Fields (NeRFs) for Geometric Reconstruction | Miriam Jäger et.al. | 2309.14800v1 | null |
2023-09-23 | MP-MVS: Multi-Scale Windows PatchMatch and Planar Prior Multi-View Stereo | Rongxuan Tan et.al. | 2309.13294v1 | link |
2023-09-22 | NeRRF: 3D Reconstruction and View Synthesis for Transparent and Specular Objects with Neural Refractive-Reflective Fields | Xiaoxue Chen et.al. | 2309.13039v1 | link |
2023-09-25 | Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates | Ka Chun Shum et.al. | 2309.11281v2 | link |
2023-09-19 | PLVS: A SLAM System with Points, Lines, Volumetric Mapping, and 3D Incremental Segmentation | Luigi Freda et.al. | 2309.10896v1 | link |
2023-09-19 | SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction | Anilkumar Swamy et.al. | 2309.10748v1 | null |
2023-09-18 | Improving Neural Indoor Surface Reconstruction with Mask-Guided Adaptive Consistency Constraints | Xinyi Yu et.al. | 2309.09739v1 | null |
2023-09-18 | Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering | Chi Zhang et.al. | 2309.09724v1 | null |
2023-09-17 | Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors | Ziwei Liao et.al. | 2309.09118v1 | null |
2023-09-13 | Exploiting Multiple Priors for Neural 3D Indoor Reconstruction | Federico Lincetto et.al. | 2309.07021v1 | null |
2023-09-12 | Semantic and Articulated Pedestrian Sensing Onboard a Moving Vehicle | Maria Priisalu et.al. | 2309.06313v1 | null |
2023-09-11 | A survey on real-time 3D scene reconstruction with SLAM methods in embedded systems | Quentin Picard et.al. | 2309.05349v1 | null |
2023-09-07 | A Food Package Recognition and Sorting System Based on Structured Light and Deep Learning | Xuanzhi Liu et.al. | 2309.03704v1 | null |
2023-09-06 | SADIR: Shape-Aware Diffusion Models for 3D Image Reconstruction | Nivetha Jayakumar et.al. | 2309.03335v1 | null |
2023-09-06 | Sparse 3D Reconstruction via Object-Centric Ray Sampling | Llukman Cerkezi et.al. | 2309.03008v1 | link |
2023-09-06 | Multi-log grasping using reinforcement learning and virtual visual servoing | Erik Wallin et.al. | 2309.02997v1 | null |
2023-09-06 | LightNeuS: Neural Surface Reconstruction in Endoscopy using Illumination Decline | Víctor M. Batlle et.al. | 2309.02777v1 | null |
2023-09-05 | GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction | Youmin Zhang et.al. | 2309.02436v1 | link |
2023-09-05 | Doppelgangers: Learning to Disambiguate Images of Similar Structures | Ruojin Cai et.al. | 2309.02420v1 | link |
2023-09-05 | TiAVox: Time-aware Attenuation Voxels for Sparse-view 4D DSA Reconstruction | Zhenghong Zhou et.al. | 2309.02318v1 | null |
2023-09-05 | Iterative Superquadric Recomposition of 3D Objects from Multiple Views | Stephan Alaniz et.al. | 2309.02102v1 | link |
2023-09-01 | Dense Voxel 3D Reconstruction Using a Monocular Event Camera | Haodong Chen et.al. | 2309.00385v1 | null |
2023-08-24 | Improving NeRF Quality by Progressive Camera Placement for Unrestricted Navigation in Complex Environments | Georgios Kopanas et.al. | 2309.00014v1 | null |
2023-08-29 | Intensity correlation holography for remote phase sensing and 3D imaging | Guillaume Thekkadath et.al. | 2308.15619v1 | null |
2023-08-28 | R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras | Aron Schmied et.al. | 2308.14713v1 | null |
2023-08-27 | Sparse3D: Distilling Multiview-Consistent Diffusion for Object Reconstruction from Sparse Views | Zi-Xin Zou et.al. | 2308.14078v1 | null |
2023-08-26 | HoloPOCUS: Portable Mixed-Reality 3D Ultrasound Tracking, Reconstruction and Overlay | Kian Wei Ng et.al. | 2308.13823v1 | null |
2023-08-25 | Textureless Deformable Surface Reconstruction with Invisible Markers | Xinyuan Li et.al. | 2308.13678v1 | null |
2023-08-23 | ARF-Plus: Controlling Perceptual Factors in Artistic Radiance Fields for 3D Scene Stylization | Wenzhao Li et.al. | 2308.12452v1 | null |
2023-08-21 | Coordinate Quantized Neural Implicit Representations for Multi-view Reconstruction | Sijia Jiang et.al. | 2308.11025v1 | link |
2023-08-19 | Root Pose Decomposition Towards Generic Non-rigid 3D Reconstruction with Monocular Videos | Yikai Wang et.al. | 2308.10089v1 | null |
2023-08-19 | TSAR-MVS: Textureless-aware Segmentation and Correlative Refinement Guided Multi-View Stereo | Zhenlong Yuan et.al. | 2308.09990v1 | null |
2023-08-19 | A Theory of Topological Derivatives for Inverse Rendering of Geometry | Ishit Mehta et.al. | 2308.09865v1 | null |
2023-08-18 | O$^2$-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model | Yubin Hu et.al. | 2308.09591v1 | link |
2023-08-17 | A Fusion of Variational Distribution Priors and Saliency Map Replay for Continual 3D Reconstruction | Sanchar Palit et.al. | 2308.08812v1 | null |
2023-08-17 | Long-Range Grouping Transformer for Multi-View 3D Reconstruction | Liying Yang et.al. | 2308.08724v1 | link |
2023-08-16 | DeDoDe: Detect, Don’t Describe – Describe, Don’t Detect for Local Feature Matching | Johan Edstedt et.al. | 2308.08479v1 | link |
2023-08-17 | ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces | Qianyi Wu et.al. | 2308.07868v2 | link |
2023-08-15 | CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction | Yan Di et.al. | 2308.07837v1 | null |
2023-08-15 | Multi-view 3D Face Reconstruction Based on Flame | Wenzhuo Zheng et.al. | 2308.07551v1 | null |
2023-08-14 | A One Stop 3D Target Reconstruction and multilevel Segmentation Method | Jiexiong Xu et.al. | 2308.06974v1 | link |
2023-08-11 | Efficient Large-scale AUV-based Visual Seafloor Mapping | Mengkun She et.al. | 2308.06147v1 | null |
2023-08-10 | PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs | Wentao Hu et.al. | 2308.05744v1 | link |
2023-08-10 | HGDNet: A Height-Hierarchy Guided Dual-Decoder Network for Single View Building Extraction and Height Estimation | Chaoran Lu et.al. | 2308.05387v1 | null |
2023-08-07 | Learning Photometric Feature Transform for Free-form Object Scan | Xiang Feng et.al. | 2308.03492v1 | null |
2023-08-04 | Reconstructing Three-Dimensional Models of Interacting Humans | Mihai Fieraru et.al. | 2308.01854v2 | link |
2023-08-02 | HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions | Andrew Guo et.al. | 2308.01477v1 | null |
2023-08-15 | Tirtha – An Automated Platform to Crowdsource Images and Create 3D Models of Heritage Sites | Jyotirmaya Shivottam et.al. | 2308.01246v2 | link |
2023-08-02 | Stereo Visual Odometry with Deep Learning-Based Point and Line Feature Matching using an Attention Graph Neural Network | Shenbagaraj Kannapiran et.al. | 2308.01125v1 | null |
2023-08-01 | Body Knowledge and Uncertainty Modeling for Monocular 3D Human Body Reconstruction | Yufei Zhang et.al. | 2308.00799v1 | null |
2023-07-31 | Onboard View Planning of a Flying Camera for High Fidelity 3D Reconstruction of a Moving Actor | Qingyuan Jiang et.al. | 2308.00134v1 | link |
2023-07-21 | Autonomous Electron Tomography Reconstruction with Machine Learning | William Millsaps et.al. | 2308.00099v1 | null |
2023-07-31 | Towards Head Computed Tomography Image Reconstruction Standardization with Deep Learning Assisted Automatic Detection | Bowen Zheng et.al. | 2307.16440v1 | null |
2023-07-27 | FS-Depth: Focal-and-Scale Depth Estimation from a Single Image in Unseen Indoor Scene | Chengrui Wei et.al. | 2307.14624v1 | null |
2023-07-27 | Physically Plausible 3D Human-Scene Reconstruction from Monocular RGB Image using an Adversarial Learning Approach | Sandika Biswas et.al. | 2307.14570v1 | null |
2023-07-27 | Creative Birds: Self-Supervised Single-View 3D Style Transfer | Renke Wang et.al. | 2307.14127v2 | link |
2023-07-24 | CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components | Davide Di Nucci et.al. | 2307.12718v1 | null |
2023-07-24 | VIRD: Immersive Match Video Analysis for High-Performance Badminton Coaching | Tica Lin et.al. | 2307.12539v1 | link |
2023-07-23 | LIST: Learning Implicitly from Spatial Transformers for Single-View 3D Reconstruction | Mohammad Samiul Arshad et.al. | 2307.12194v1 | link |
2023-07-22 | Replay: Multi-modal Multi-view Acted Videos for Casual Holography | Roman Shapovalov et.al. | 2307.12067v1 | link |
2023-07-20 | SimCol3D – 3D Reconstruction during Colonoscopy Challenge | Anita Rau et.al. | 2307.11261v1 | link |
2023-07-14 | Transient Neural Radiance Fields for Lidar View Synthesis and 3D Reconstruction | Anagh Malik et.al. | 2307.09555v1 | null |
2023-07-18 | NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF | Stefan Lionar et.al. | 2307.09112v1 | link |
2023-07-16 | Enforcing Topological Interaction between Implicit Surfaces via Uniform Sampling | Hieu Le et.al. | 2307.08716v1 | null |
2023-07-13 | Bag of Views: An Appearance-based Approach to Next-Best-View Planning for 3D Reconstruction | Sara Hatami Gazani et.al. | 2307.05832v2 | link |
2023-07-11 | 3D detection of roof sections from a single satellite image and application to LOD2-building reconstruction | Johann Lussange et.al. | 2307.05409v1 | null |
2023-07-08 | MAP-NBV: Multi-agent Prediction-guided Next-Best-View Planning for Active 3D Object Reconstruction | Harnaik Dhami et.al. | 2307.04004v1 | null |
2023-07-07 | Depth Estimation Analysis of Orthogonally Divergent Fisheye Cameras with Distortion Removal | Matvei Panteleev et.al. | 2307.03602v1 | null |
2023-07-07 | RGB-D Mapping and Tracking in a Plenoxel Radiance Field | Andreas L. Teigen et.al. | 2307.03404v1 | link |
2023-07-04 | User-Friendly Safety Monitoring System for Manufacturing Cobots | Ye-Ji Mun et.al. | 2307.01886v1 | null |
2023-06-29 | One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization | Minghua Liu et.al. | 2306.16928v1 | link |
2023-06-23 | LightGlue: Local Feature Matching at Light Speed | Philipp Lindenberger et.al. | 2306.13643v1 | link |
2023-06-24 | 3D Reconstruction of Spherical Images based on Incremental Structure from Motion | San Jiang et.al. | 2306.12770v2 | link |
2023-06-26 | Infinite Photorealistic Worlds using Procedural Generation | Alexander Raistrick et.al. | 2306.09310v2 | null |
2023-06-15 | NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations | Varun Jampani et.al. | 2306.09109v1 | link |
2023-06-15 | Enhancing Neural Rendering Methods with Image Augmentations | Juan C. Pérez et.al. | 2306.08904v1 | null |
2023-06-14 | Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data | Nilesh Kulkarni et.al. | 2306.08671v1 | null |
2023-06-13 | Viewset Diffusion: (0-)Image-Conditioned 3D Generative Models from 2D Data | Stanislaw Szymanowicz et.al. | 2306.07881v1 | null |
2023-06-12 | Reconstructing Heterogeneous Cryo-EM Molecular Structures by Decomposing Them into Polymer Chains | Bongjin Koo et.al. | 2306.07274v1 | null |
2023-06-10 | 3D reconstruction using Structure for Motion | Kshitij Karnawat et.al. | 2306.06360v1 | link |
2023-06-15 | NERFBK: A High-Quality Benchmark for NERF-Based 3D Reconstruction | Ali Karami et.al. | 2306.06300v2 | link |
2023-06-12 | Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction | Vanessa Sklyarova et.al. | 2306.05872v2 | link |
2023-06-08 | 2D Supervised Monocular 3D Object Detection by Global-to-Local 3D Reconstruction | Jiawei He et.al. | 2306.05418v1 | null |
2023-06-08 | Enhance-NeRF: Multiple Performance Evaluation for Neural Radiance Fields | Qianqiu Tan et.al. | 2306.05303v1 | link |
2023-06-07 | BU-CVKit: Extendable Computer Vision Framework for Species Independent Tracking and Analysis | Mahir Patel et.al. | 2306.04736v1 | null |
2023-06-09 | DiViNeT: 3D Reconstruction from Disparate Views via Neural Template Regularization | Aditya Vora et.al. | 2306.04699v2 | null |
2023-06-05 | BeyondPixels: A Comprehensive Review of the Evolution of Neural Radiance Fields | AKM Shahariar Azad Rabby et.al. | 2306.03000v1 | null |
2023-06-05 | Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data | Nikolay Patakin et.al. | 2306.02878v1 | null |
2023-06-05 | Computational 3D topographic microscopy from terabytes of data per sample | Kevin C. Zhou et.al. | 2306.02634v1 | null |
2023-06-08 | Adaptive Robotic Information Gathering via Non-Stationary Gaussian Processes | Weizhe Chen et.al. | 2306.01263v2 | link |
2023-06-01 | BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image | Tao Chu et.al. | 2306.00965v1 | link |
2023-05-31 | Humans in 4D: Reconstructing and Tracking Humans with Transformers | Shubham Goel et.al. | 2305.20091v1 | link |
2023-05-30 | Template-free Articulated Neural Point Clouds for Reposable View Synthesis | Lukas Uzolas et.al. | 2305.19065v1 | link |
2023-05-29 | Synfeal: A Data-Driven Simulator for End-to-End Camera Localization | Daniel Coelho et.al. | 2305.18260v1 | link |
2023-06-04 | VoxDet: Voxel Learning for Novel Instance Detection | Bowen Li et.al. | 2305.17220v3 | link |
2023-05-25 | Look Ma, No Hands! Agent-Environment Factorization of Egocentric Videos | Matthew Chang et.al. | 2305.16301v1 | null |
2023-05-25 | Domain-Adaptive Full-Face Gaze Estimation via Novel-View-Synthesis and Feature Disentanglement | Jiawei Qin et.al. | 2305.16140v1 | null |
2023-05-25 | Robust Category-Level 3D Pose Estimation from Synthetic Data | Jiahao Yang et.al. | 2305.16124v1 | null |
2023-05-25 | T2TD: Text-3D Generation Model based on Prior Knowledge Guidance | Weizhi Nie et.al. | 2305.15753v1 | null |
2023-05-23 | Cross3DVG: Baseline and Dataset for Cross-Dataset 3D Visual Grounding on Different RGB-D Scans | Taiki Miyanishi et.al. | 2305.13876v1 | link |
2023-05-22 | A three-dimensional MR-STAT protocol for high-resolution multi-parametric quantitative MRI | Hongyan Liu et.al. | 2305.13022v1 | null |
2023-05-29 | Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models | Byungjun Kim et.al. | 2305.11870v2 | link |
2023-05-19 | Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields | Jingbo Zhang et.al. | 2305.11588v1 | link |
2023-05-19 | RGB-D And Thermal Sensor Fusion: A Systematic Literature Review | Martin Brenner et.al. | 2305.11427v1 | null |
2023-05-18 | Progressive Learning of 3D Reconstruction Network from 2D GAN Data | Aysegul Dundar et.al. | 2305.11102v1 | null |
2023-05-18 | ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis | Shoukang Hu et.al. | 2305.11031v1 | link |
2023-05-17 | Colonoscopy Coverage Revisited: Identifying Scanning Gaps in Real-Time | G. Leifman et.al. | 2305.10026v1 | null |
2023-05-15 | AutoRecon: Automated 3D Object Discovery and Reconstruction | Yuang Wang et.al. | 2305.08810v1 | null |
2023-05-11 | Towards a Better Understanding of the Computer Vision Research Community in Africa | Abdul-Hakeem Omotayo et.al. | 2305.06773v1 | null |
2023-05-10 | Scan2LoD3: Reconstructing semantic 3D building models at LoD3 using ray casting and Bayesian networks | Olaf Wysocki et.al. | 2305.06314v1 | null |
2023-05-08 | RelPose++: Recovering 6D Poses from Sparse-view Observations | Amy Lin et.al. | 2305.04926v1 | link |
2023-05-04 | UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation | Guoqing Yang et.al. | 2305.02627v1 | null |
2023-05-03 | Biological Hotspot Mapping in Coral Reefs with Robotic Visual Surveys | Daniel Yang et.al. | 2305.02330v1 | link |
2023-04-30 | Second-order Anisotropic Gaussian Directional Derivative Filters for Blob Detection | Jie Ren et.al. | 2305.00435v1 | null |
2023-04-29 | NSLF-OL: Online Learning of Neural Surface Light Fields alongside Real-time Incremental 3D Reconstruction | Yijun Yuan et.al. | 2305.00282v1 | null |
2023-04-23 | UHRNet: A Deep Learning-Based Method for Accurate 3D Reconstruction from a Single Fringe-Pattern | Yixiao Wang et.al. | 2304.14503v1 | link |
2023-04-27 | Learning Articulated Shape with Keypoint Pseudo-labels from Web Images | Anastasis Stathopoulos et.al. | 2304.14396v1 | null |
2023-05-03 | Combining HoloLens with Instant-NeRFs: Advanced Real-Time 3D Mobile Mapping | Dennis Haitz et.al. | 2304.14301v2 | null |
2023-04-25 | Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs | Mizuki Tabata et.al. | 2304.12624v1 | null |
2023-04-24 | Instant-3D: Instant Neural Radiance Field Training Towards On-Device AR/VR 3D Reconstruction | Sixu Li et.al. | 2304.12467v1 | null |
2023-04-24 | Unsupervised Style-based Explicit 3D Face Reconstruction from Single Image | Heng Yu et.al. | 2304.12455v1 | null |
2023-04-24 | gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction | Zerui Chen et.al. | 2304.11970v1 | null |
2023-04-24 | Learning Visibility Field for Detailed 3D Human Reconstruction and Relighting | Ruichen Zheng et.al. | 2304.11900v1 | null |
2023-04-24 | NoiseTrans: Point Cloud Denoising with Transformers | Guangzhe Hou et.al. | 2304.11812v1 | null |
2023-04-20 | A Comparative Neural Radiance Field (NeRF) 3D Analysis of Camera Poses from HoloLens Trajectories and Structure from Motion | Miriam Jäger et.al. | 2304.10664v1 | null |
2023-04-20 | Reconstructing Signing Avatars From Video Using Linguistic Priors | Maria-Paola Forte et.al. | 2304.10482v1 | null |
2023-04-19 | Anything-3D: Towards Single-view Anything Reconstruction in the Wild | Qiuhong Shen et.al. | 2304.10261v1 | link |
2023-04-20 | A geometry-aware deep network for depth estimation in monocular endoscopy | Yongming Yang et.al. | 2304.10241v1 | link |
2023-04-19 | Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra | Jonas Kulhanek et.al. | 2304.09987v1 | link |
2023-04-20 | Single-View View Synthesis with Self-Rectified Pseudo-Stereo | Yang Zhou et.al. | 2304.09527v2 | null |
2023-04-19 | 3 Dimensional Dense Reconstruction: A Review of Algorithms and Dataset | Yangming Li et.al. | 2304.09371v1 | null |
2023-04-18 | SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes | Yiming Gao et.al. | 2304.08971v1 | null |
2023-04-17 | Learning How To Robustly Estimate Camera Pose in Endoscopic Videos | Michel Hayoz et.al. | 2304.08023v1 | link |
2023-04-15 | Temporally Consistent Online Depth Estimation Using Point-Based Fusion | Numair Khan et.al. | 2304.07435v1 | link |
2023-04-17 | Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction | Hansheng Chen et.al. | 2304.06714v2 | link |
2023-04-12 | SiLK – Simple Learned Keypoints | Pierre Gleize et.al. | 2304.06194v1 | link |
2023-04-12 | Dynamic Voxel Grid Optimization for High-Fidelity RGB-D Supervised Surface Reconstruction | Xiangyu Xu et.al. | 2304.06178v1 | null |
2023-04-11 | EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls | Ziyun Wang et.al. | 2304.05296v1 | link |
2023-04-10 | Neural Lens Modeling | Wenqi Xian et.al. | 2304.04848v1 | null |
2023-04-10 | Evaluate Geometry of Radiance Field with Low-frequency Color Prior | Qihang Fang et.al. | 2304.04351v1 | link |
2023-04-11 | Analysis of Sampling Strategies for Implicit 3D Reconstruction | Q. Liu et.al. | 2304.03999v2 | null |
2023-04-08 | 3D GANs and Latent Space: A comprehensive survey | Satya Pratheek Tata et.al. | 2304.03932v1 | null |
2023-04-08 | Photometric Correction for Infrared Sensors | Jincheng Zhang et.al. | 2304.03930v1 | null |
2023-04-07 | ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation | Xiaoming Zhao et.al. | 2304.03608v1 | link |
2023-04-06 | Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes | Zian Wang et.al. | 2304.03266v1 | null |
2023-04-06 | DeLiRa: Self-Supervised Depth, Light, and Radiance Fields | Vitor Guizilini et.al. | 2304.02797v1 | null |
2023-04-05 | Image Stabilization for Hololens Camera in Remote Collaboration | Gowtham Senthil et.al. | 2304.02736v1 | null |
2023-04-05 | Real-Time Dense 3D Mapping of Underwater Environments | Weihan Wang et.al. | 2304.02704v1 | link |
2023-04-04 | USTC FLICAR: A Multisensor Fusion Dataset of LiDAR-Inertial-Camera for Heavy-duty Autonomous Aerial Work Robots | Ziming Wang et.al. | 2304.01986v1 | null |
2023-04-04 | End-to-End Latency Optimization of Multi-view 3D Reconstruction for Disaster Response | Xiaojie Zhang et.al. | 2304.01488v1 | null |
2023-04-04 | FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction | Noah Stier et.al. | 2304.01480v1 | link |
2023-04-03 | One-Shot View Planning for Fast and Complete Unknown Object Reconstruction | Sicong Pan et.al. | 2304.00910v1 | link |
2023-03-31 | LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses | Noah Stier et.al. | 2304.00054v1 | link |
2023-04-03 | Three-dimensional coherent diffraction snapshot imaging using extreme ultraviolet radiation from a free electron laser | Danny Fainozzi et.al. | 2303.18166v2 | null |
2023-03-30 | Enhanced Stable View Synthesis | Nishant Jain et.al. | 2303.17094v1 | null |
2023-03-29 | AirLine: Efficient Learnable Line Detection with Local Edge Voting | Xiao Lin et.al. | 2303.16500v1 | link |
2023-03-29 | Multi-View Azimuth Stereo via Tangent Space Consistency | Xu Cao et.al. | 2303.16447v1 | link |
2023-03-27 | NeUDF: Learning Unsigned Distance Fields from Multi-view Images for Reconstructing Non-watertight Models | Fei Hou et.al. | 2303.15368v1 | null |
2023-03-27 | TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering | Jaehoon Choi et.al. | 2303.15060v1 | null |
2023-03-26 | Clean-NeRF: Reformulating NeRF to account for View-Dependent Observations | Xinhang Liu et.al. | 2303.14707v1 | null |
2023-03-25 | PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters | Shuhong Chen et.al. | 2303.14587v1 | link |
2023-03-25 | LPFF: A Portrait Dataset for Face Generators Across Large Poses | Yiqian Wu et.al. | 2303.14407v1 | null |
2023-03-24 | BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects | Bowen Wen et.al. | 2303.14158v1 | link |
2023-03-24 | Deformable Model Driven Neural Rendering for High-fidelity 3D Reconstruction of Human Heads Under Low-View Settings | Baixin Xu et.al. | 2303.13855v1 | link |
2023-03-24 | Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container | Jinguang Tong et.al. | 2303.13805v1 | link |
2023-03-23 | SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates | Mikaela Angelina Uy et.al. | 2303.13582v1 | null |
2023-03-21 | Real-time volumetric rendering of dynamic humans | Ignacio Rocco et.al. | 2303.11898v1 | null |
2023-03-20 | Zero-1-to-3: Zero-shot One Image to 3D Object | Ruoshi Liu et.al. | 2303.11328v1 | link |
2023-03-20 | DIME-Net: Neural Network-Based Dynamic Intrinsic Parameter Rectification for Cameras with Optical Image Stabilization System | Shu-Hao Yeh et.al. | 2303.11307v1 | null |
2023-03-20 | Ref-NeuS: Ambiguity-Reduced Neural Implicit Surface Learning for Multi-View Reconstruction with Reflection | Wenhang Ge et.al. | 2303.10840v1 | link |
2023-03-14 | FingerSLAM: Closed-loop Unknown Object Localization and Reconstruction from Visuo-tactile Feedback | Jialiang Zhao et.al. | 2303.07997v1 | null |
2023-03-11 | Normal-guided Garment UV Prediction for Human Re-texturing | Yasamin Jafarian et.al. | 2303.06504v1 | null |
2023-03-11 | Just Flip: Flipped Observation Generation and Optimization for Neural Radiance Fields to Cover Unobserved View | Minjae Lee et.al. | 2303.06335v1 | link |
2023-03-10 | ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction | Zhengdi Yu et.al. | 2303.05938v1 | link |
2023-03-10 | Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction | Mingfang Zhang et.al. | 2303.05937v1 | null |
2023-03-08 | FastSurf: Fast Neural RGB-D Surface Reconstruction using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning | Seunghwan Lee et.al. | 2303.04508v1 | link |
2023-03-08 | Corner Detection Based on Multi-directional Gabor Filters with Multi-scales | Huaqing Wang et.al. | 2303.04334v1 | null |
2023-03-08 | DroNeRF: Real-time Multi-agent Drone Pose Optimization for Computing Neural Radiance Fields | Dipam Patel et.al. | 2303.04322v1 | null |
2023-03-07 | Proactive Multi-Camera Collaboration For 3D Human Pose Estimation | Hai Ci et.al. | 2303.03767v1 | null |
2023-03-06 | System for 3D Acquisition and 3D Reconstruction using Structured Light for Sewer Line Inspection | Johannes Künzel et.al. | 2303.02978v1 | null |
2023-03-03 | Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement | Jiaxiang Tang et.al. | 2303.02091v1 | link |
2023-03-09 | MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices | Kejie Li et.al. | 2303.01932v2 | link |
2023-03-01 | Motion Compensation via Epipolar Consistency for In-Vivo X-Ray Microscopy | Mareike Thies et.al. | 2303.00449v1 | null |
2023-02-28 | 3D Coronary Vessel Reconstruction from Bi-Plane Angiography using Graph Convolutional Networks | Kit Mills Bransby et.al. | 2302.14795v1 | null |
2023-02-28 | Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors | Ji Hou et.al. | 2302.14746v1 | null |
2023-02-27 | UMIFormer: Mining the Correlations between Similar Tokens for Multi-View 3D Reconstruction | Zhenwei Zhu et.al. | 2302.13987v1 | link |
2023-02-26 | Perceiving Unseen 3D Objects by Poking the Objects | Linghao Chen et.al. | 2302.13375v1 | null |
2023-02-25 | SUPS: A Simulated Underground Parking Scenario Dataset for Autonomous Driving | Jiawei Hou et.al. | 2302.12966v1 | link |
2023-02-24 | 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data | Nicolai Häni et.al. | 2302.12883v1 | null |
2023-02-23 | View Consistency Aware Holistic Triangulation for 3D Human Pose Estimation | Xiaoyue Wan et.al. | 2302.11301v2 | null |
2023-02-23 | $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction | Luke Melas-Kyriazi et.al. | 2302.10668v2 | link |
2023-02-23 | RealFusion: 360° Reconstruction of Any Object from a Single Image | Luke Melas-Kyriazi et.al. | 2302.10663v2 | null |
2023-02-20 | UAVStereo: A Multiple Resolution Dataset for Stereo Matching in UAV Scenarios | Zhang Xiaoyi et.al. | 2302.10082v1 | link |
2023-02-14 | HR-NeuS: Recovering High-Frequency Surface Geometry via Neural Implicit Surfaces | Erich Liang et.al. | 2302.06793v1 | null |
2023-02-14 | Boosted ab initio Cryo-EM 3D Reconstruction with ACE-EM | Lin Yao et.al. | 2302.06091v2 | null |
2023-02-11 | 3D Colored Shape Reconstruction from a Single RGB Image through Diffusion | Bo Li et.al. | 2302.05573v1 | null |
2023-02-09 | 3D reconstruction of spherical images: A review of techniques, applications, and prospects | San Jiang et.al. | 2302.04495v1 | null |
2023-02-09 | PredRecon: A Prediction-boosted Planning Framework for Fast and High-quality Autonomous Aerial Reconstruction | Chen Feng et.al. | 2302.04488v1 | link |
2023-02-07 | S4R: Self-Supervised Semantic Scene Reconstruction from RGB-D Scans | Junwen Huang et.al. | 2302.03640v1 | null |
2023-01-30 | Mono-STAR: Mono-camera Scene-level Tracking and Reconstruction | Haonan Chang et.al. | 2301.13244v1 | link |
2023-01-27 | A Comparison of Tiny-nerf versus Spatial Representations for 3d Reconstruction | Saulo Abraham Gante et.al. | 2301.11522v1 | null |
2023-01-25 | Local Feature Extraction from Salient Regions by Feature Map Transformation | Yerim Jung et.al. | 2301.10413v1 | null |
2023-02-02 | 3D Reconstruction of Non-cooperative Resident Space Objects using Instant NGP-accelerated NeRF and D-NeRF | Trupti Mahendrakar et.al. | 2301.09060v2 | null |
2023-01-19 | Parallelized computational 3D video microscopy of freely moving organisms at multiple gigapixels per second | Kevin C. Zhou et.al. | 2301.08351v1 | link |
2023-01-19 | Multiview Compressive Coding for 3D Reconstruction | Chao-Yuan Wu et.al. | 2301.08247v1 | link |
2023-01-19 | Regularizing disparity estimation via multi task learning with structured light reconstruction | Alistair Weld et.al. | 2301.08140v1 | null |
2023-01-12 | Edge Preserving Implicit Surface Representation of Point Clouds | Xiaogang Wang et.al. | 2301.04860v1 | null |
2023-01-11 | Elevation Estimation-Driven Building 3D Reconstruction from Single-View Remote Sensing Imagery | Yongqiang Mao et.al. | 2301.04581v1 | null |
2023-01-11 | First 3D reconstruction of a blast furnace using muography | Amélie Cohu et.al. | 2301.04354v1 | null |
2023-01-04 | Towards a Pipeline for Real-Time Visualization of Faces for VR-based Telepresence and Live Broadcasting Utilizing Neural Rendering | Philipp Ladwig et.al. | 2301.01490v1 | link |
2023-01-03 | BS3D: Building-scale 3D Reconstruction from RGB-D Images | Janne Mustaniemi et.al. | 2301.01057v1 | null |
2022-12-31 | Ponder: Point Cloud Pre-training via Neural Rendering | Di Huang et.al. | 2301.00157v1 | null |
2022-12-28 | NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same Action | Kuan-Chieh Wang et.al. | 2212.13660v1 | link |
2022-12-24 | Polarimetric Multi-View Inverse Rendering | Jinyu Zhao et.al. | 2212.12721v1 | null |
(<a href=#Updated-on-20240404>back to top</a>)

Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-04-03 | Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Keyu Tian et.al. | 2404.02905v1 | link |
2024-04-03 | LidarDM: Generative LiDAR Simulation in a Generated World | Vlas Zyrianov et.al. | 2404.02903v1 | null |
2024-04-03 | DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets | Harsh Rangwani et.al. | 2404.02900v1 | link |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899v1 | null |
2024-04-03 | A Mean Field Game Model for Timely Computation in Edge Computing Systems | Shubham Aggarwal et.al. | 2404.02898v1 | null |
2024-04-03 | Deep Image Composition Meets Image Forgery | Eren Tahir et.al. | 2404.02897v1 | link |
2024-04-03 | ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Yifan Xu et.al. | 2404.02893v1 | null |
2024-04-03 | PoCo: Point Context Cluster for RGBD Indoor Place Recognition | Jing Liang et.al. | 2404.02885v1 | null |
2024-04-02 | Segment Any 3D Object with Language | Seungjun Lee et.al. | 2404.02157v1 | null |
2024-04-02 | Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration | Akshay Dudhane et.al. | 2404.02154v1 | null |
2024-04-02 | GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image | Chong Bao et.al. | 2404.02152v1 | null |
2024-04-02 | Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models | Zeyu Yang et.al. | 2404.02148v1 | link |
2024-04-02 | Harder, Better, Faster, Stronger: Interactive Visualization for Human-Centered AI Tools | Md Naimul Hoque et.al. | 2404.02147v1 | null |
2024-04-02 | Iterated Learning Improves Compositionality in Large Vision-Language Models | Chenhao Zheng et.al. | 2404.02145v1 | null |
2024-04-02 | Multiparametric quantification and visualization of liver fat using ultrasound | Jihye Baek et.al. | 2404.02143v1 | null |
2024-03-29 | Gecko: Versatile Text Embeddings Distilled from Large Language Models | Jinhyuk Lee et.al. | 2403.20327v1 | null |
2024-03-29 | Shaving Logs via Large Sieve Inequality: Faster Algorithms for Sparse Convolution and More | Ce Jin et.al. | 2403.20326v1 | null |
2024-03-29 | Structure and Dynamics of Magneto-Inertial, Differentially Rotating Laboratory Plasmas | V. Valenzuela-Villaseca et.al. | 2403.20321v1 | null |
2024-03-29 | SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects | Abhinav Kumar et.al. | 2403.20318v1 | link |
2024-03-29 | Convolutional Prompting meets Language Models for Continual Learning | Anurag Roy et.al. | 2403.20317v1 | null |
2024-03-29 | Optimal Communication for Classic Functions in the Coordinator Model and Beyond | Hossein Esfandiari et.al. | 2403.20307v1 | null |
2024-03-28 | GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling | Bowen Zhang et.al. | 2403.19655v1 | null |
2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu et.al. | 2403.19653v1 | link |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652v1 | null |
2024-03-28 | GraspXL: Generating Grasping Motions for Diverse Objects at Scale | Hui Zhang et.al. | 2403.19649v1 | null |
2024-03-28 | Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models | Samuel Marks et.al. | 2403.19647v1 | link |
2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva et.al. | 2403.19645v1 | null |
2024-03-27 | Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark | Ziyang Chen et.al. | 2403.18821v1 | null |
2024-03-27 | MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering | Guoxing Sun et.al. | 2403.18820v1 | null |
2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | Daniel Winter et.al. | 2403.18818v1 | null |
2024-03-27 | Garment3DGen: 3D Garment Stylization and Texture Generation | Nikolaos Sarafianos et.al. | 2403.18816v1 | null |
2024-03-27 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Yanwei Li et.al. | 2403.18814v1 | link |
2024-03-27 | Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment | Li Siyao et.al. | 2403.18811v1 | null |
2024-03-28 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807v2 | link |
2024-03-26 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal et.al. | 2403.17936v1 | null |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935v1 | link |
2024-03-26 | SLEDGE: Synthesizing Simulation Environments for Driving Agents with Generative Models | Kashyap Chitta et.al. | 2403.17933v1 | null |
2024-03-26 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution | Wei Tao et.al. | 2403.17927v1 | null |
2024-03-26 | AID: Attention Interpolation of Text-to-Image Diffusion | Qiyuan He et.al. | 2403.17924v1 | link |
2024-03-26 | The Need for Speed: Pruning Transformers with One Recipe | Samir Khaki et.al. | 2403.17921v1 | link |
2024-03-26 | TC4D: Trajectory-Conditioned Text-to-4D Generation | Sherwin Bahmani et.al. | 2403.17920v1 | null |
2024-03-26 | AgentStudio: A Toolkit for Building General Virtual Agents | Longtao Zheng et.al. | 2403.17918v1 | null |
2024-03-25 | Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning | Sicong Pan et.al. | 2403.16803v1 | null |
2024-03-25 | Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback | Zhangqian Bi et.al. | 2403.16792v1 | null |
2024-03-25 | Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise | Dilum Fernando et.al. | 2403.16790v1 | null |
2024-03-25 | HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation | Linglin Jing et.al. | 2403.16788v1 | null |
2024-03-25 | Creating a Digital Twin of Spinal Surgery: A Proof of Concept | Jonas Hein et.al. | 2403.16736v1 | null |
2024-03-25 | Improving Diffusion Models’s Data-Corruption Resistance using Scheduled Pseudo-Huber Loss | Artem Khrapov et.al. | 2403.16728v1 | link |
2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye et.al. | 2403.15389v1 | null |
2024-03-22 | LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis | Kevin Xie et.al. | 2403.15385v1 | null |
2024-03-22 | ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars | Zhenwei Wang et.al. | 2403.15383v1 | null |
2024-03-22 | DragAPart: Learning a Part-Level Motion Prior for Articulated Objects | Ruining Li et.al. | 2403.15382v1 | null |
2024-03-22 | Long-CLIP: Unlocking the Long-Text Capability of CLIP | Beichen Zhang et.al. | 2403.15378v1 | link |
2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | Yi Wang et.al. | 2403.15377v1 | link |
2024-03-22 | A Modular, End-to-End Next-Generation Network Testbed: Towards a Fully Automated Network Management Platform | Ali Chouman et.al. | 2403.15376v1 | null |
2024-03-21 | Zero-Shot Multi-Object Shape Completion | Shun Iwase et.al. | 2403.14628v1 | null |
2024-03-21 | MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images | Yuedong Chen et.al. | 2403.14627v1 | link |
2024-03-21 | Simplified Diffusion Schrödinger Bridge | Zhicong Tang et.al. | 2403.14623v1 | link |
2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Yinghao Xu et.al. | 2403.14621v1 | link |
2024-03-21 | ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition | Tianhao Wu et.al. | 2403.14619v1 | null |
2024-03-21 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Xiang Fan et.al. | 2403.14617v1 | null |
2024-03-21 | Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning | Hasindri Watawana et.al. | 2403.14616v1 | link |
2024-03-21 | DreamReward: Text-to-3D Generation with Human Preference | Junliang Ye et.al. | 2403.14613v1 | null |
2024-03-21 | Explorative Inbetweening of Time and Space | Haiwen Feng et.al. | 2403.14611v1 | null |
2024-03-20 | On Pretraining Data Diversity for Self-Supervised Learning | Hasan Abed Al Kader Hammoud et.al. | 2403.13808v1 | link |
2024-03-20 | Editing Massive Concepts in Text-to-Image Diffusion Models | Tianwei Xiong et.al. | 2403.13807v1 | link |
2024-03-20 | Learning from Models and Data for Visual Grounding | Ruozhen He et.al. | 2403.13804v1 | null |
2024-03-20 | Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments | Yang Yang et.al. | 2403.13803v1 | link |
2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | Vincent Tao Hu et.al. | 2403.13802v1 | link |
2024-03-20 | Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs | Yusuke Mikami et.al. | 2403.13801v1 | link |
2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | Jingxi Chen et.al. | 2403.13800v1 | null |
2024-03-20 | Reverse Training to Nurse the Reversal Curse | Olga Golovneva et.al. | 2403.13799v1 | null |
2024-03-20 | Hierarchical NeuroSymbolic Approach for Action Quality Assessment | Lauren Okamoto et.al. | 2403.13798v1 | null |
2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | Chao Yi et.al. | 2403.13797v1 | null |
2024-03-19 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Zhuoshi Pan et.al. | 2403.12968v1 | link |
2024-03-19 | Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment | Mengting Chen et.al. | 2403.12965v1 | null |
2024-03-19 | Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models | Ce Zhang et.al. | 2403.12964v1 | link |
2024-03-19 | FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Linjiang Huang et.al. | 2403.12963v1 | link |
2024-03-19 | TexTile: A Differentiable Metric for Texture Tileability | Carlos Rodriguez-Pardo et.al. | 2403.12961v1 | null |
2024-03-19 | FaceXFormer: A Unified Transformer for Facial Analysis | Kartik Narayan et.al. | 2403.12960v1 | link |
2024-03-19 | GVGEN: Text-to-3D Generation with Volumetric Representation | Xianglong He et.al. | 2403.12957v1 | null |
2024-03-19 | Abiogenesis: a possible quantum interpretation of the telepoietic conjecture | Vittorio Cocchi et.al. | 2403.12955v1 | null |
2024-03-19 | Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Elaine Sui et.al. | 2403.12952v1 | link |
2024-03-18 | RIS-aided Single-frequency 3D Imaging by Exploiting Multi-view Image Correlations | Yixuan Huang et.al. | 2403.11764v1 | null |
2024-03-19 | Full-Duplex MU-MIMO Systems with Coarse Quantization: How Many Bits Do We Need? | Seunghyeong Yoo et.al. | 2403.11762v2 | null |
2024-03-18 | Why E.T. Can’t Phone Home: A Global View on IP-based Geoblocking at VoWiFi | Gabriel Karl Gegenhuber et.al. | 2403.11759v1 | null |
2024-03-18 | Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs | M. Jehanzeb Mirza et.al. | 2403.11755v1 | link |
2024-03-18 | Asymptotically Optimal Codes for $(t,s)$-Burst Error | Yubo Sun et.al. | 2403.11750v1 | null |
2024-03-18 | Embedded Named Entity Recognition using Probing Classifiers | Nicholas Popovič et.al. | 2403.11747v1 | null |
2024-03-18 | Revisiting Tensor Basis Neural Networks for Reynolds stress modeling: application to plane channel and square duct flows | Jiayi Cai et.al. | 2403.11746v1 | null |
2024-03-18 | Matter and cosmogenesis in Kant’s Theory of the Heavens | Garance Benoit et.al. | 2403.11710v1 | null |
2024-03-18 | Significant impact of light-matter strong coupling on chiral nonlinear optical effect | Daichi Okada et.al. | 2403.11709v1 | null |
2024-03-18 | Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models | Emilian Postolache et.al. | 2403.11706v1 | link |
2024-03-18 | Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing | Juan Zhang et.al. | 2403.11700v1 | null |
2024-03-18 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697v1 | null |
2024-03-18 | Generalization error of spectral algorithms | Maksim Velikanov et.al. | 2403.11696v1 | null |
2024-03-18 | Beamforming Design for Semantic-Bit Coexisting Communication System | Maojun Zhang et.al. | 2403.11693v1 | null |
2024-03-15 | P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors | Zhou Jiang et.al. | 2403.10521v1 | null |
2024-03-15 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Ronghui Li et.al. | 2403.10518v1 | link |
2024-03-15 | FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Stephanie Fu et.al. | 2403.10516v1 | link |
2024-03-15 | A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction | Anshul Gupta et.al. | 2403.10511v1 | null |
2024-03-15 | Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization | Ratnadira Widyasari et.al. | 2403.10507v1 | null |
2024-03-15 | Belief Change based on Knowledge Measures | Umberto Straccia et.al. | 2403.10502v1 | null |
2024-03-14 | SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior | Huan-ang Gao et.al. | 2403.09638v1 | null |
2024-03-14 | GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping | Yuhang Zheng et.al. | 2403.09637v1 | link |
2024-03-14 | Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference | Piotr Nawrot et.al. | 2403.09636v1 | null |
2024-03-14 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong et.al. | 2403.09634v1 | null |
2024-03-14 | Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image | Yiqun Mei et.al. | 2403.09632v1 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631v1 | null |
2024-03-14 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang et.al. | 2403.09630v1 | link |
2024-03-14 | Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Eric Zelikman et.al. | 2403.09629v1 | link |
2024-03-14 | Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation | Fangfu Liu et.al. | 2403.09625v1 | null |
2024-03-14 | Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering | Zeyu Liu et.al. | 2403.09622v1 | null |
2024-03-13 | FastMAC: Stochastic Spectral Sampling of Correspondence Graph | Yifei Zhang et.al. | 2403.08770v1 | link |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764v1 | null |
2024-03-13 | A local model for the optical energy and momentum transfer in dielectric media and the microscopic origin of Abraham’s force density | B. Anghinoni et.al. | 2403.08752v1 | null |
2024-03-13 | iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer | Dinh-Khoi Vo et.al. | 2403.08746v1 | link |
2024-03-12 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension | Fangyun Wei et.al. | 2403.07872v1 | null |
2024-03-12 | TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation | Shivin Dass et.al. | 2403.07869v1 | null |
2024-03-12 | Exploring Safety Generalization Challenges of Large Language Models via Code | Qibing Ren et.al. | 2403.07865v1 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860v1 | link |
2024-03-12 | Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias | Sierra Wyllie et.al. | 2403.07857v1 | null |
2024-03-12 | Quantifying and Mitigating Privacy Risks for Tabular Generative Models | Chaoyi Zhu et.al. | 2403.07842v1 | null |
2024-03-11 | A representation-learning game for classes of prediction tasks | Neria Uzan et.al. | 2403.06971v1 | null |
2024-03-11 | The pitfalls of next-token prediction | Gregor Bachmann et.al. | 2403.06963v1 | link |
2024-03-11 | Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer | Siddhant Satyanaik et.al. | 2403.06953v1 | null |
2024-03-11 | SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data | Jialu Li et.al. | 2403.06952v1 | null |
2024-03-08 | Tell, Don’t Show!: Language Guidance Eases Transfer Across Domains in Images and Videos | Tarun Kalluri et.al. | 2403.05535v1 | null |
2024-03-08 | Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets | Lorenzo Brigato et.al. | 2403.05532v1 | null |
2024-03-08 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | Machel Reid et.al. | 2403.05530v1 | null |
2024-03-08 | The Computational Complexity of Learning Gaussian Single-Index Models | Alex Damian et.al. | 2403.05529v1 | null |
2024-03-08 | GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Hao Kang et.al. | 2403.05527v1 | link |
2024-03-08 | Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola | Yijiang Li et.al. | 2403.05523v1 | null |
2024-03-08 | Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought | James Chua et.al. | 2403.05518v1 | link |
2024-03-07 | BloomGML: Graph Machine Learning through the Lens of Bilevel Optimization | Amber Yijia Zheng et.al. | 2403.04763v1 | link |
2024-03-07 | Lifelong Intelligence Beyond the Edge using Hyperdimensional Computing | Xiaofan Yu et.al. | 2403.04759v1 | link |
2024-03-07 | KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts | Adam Coscia et.al. | 2403.04758v1 | link |
2024-03-07 | Preliminary Guidelines For Combining Data Integration and Visual Data Analysis | Adam Coscia et.al. | 2403.04757v1 | link |
2024-03-07 | Mechanism for Decision-aware Collaborative Federated Learning: A Pitfall of Shapley Values | Meng Qi et.al. | 2403.04753v1 | null |
2024-03-07 | JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework | Artur P. Toshev et.al. | 2403.04750v1 | link |
2024-03-07 | A General Calibrated Regret Metric for Detecting and Mitigating Human-Robot Interaction Failures | Kensuke Nakamura et.al. | 2403.04745v1 | null |
2024-03-06 | Backtracing: Retrieving the Cause of the Query | Rose E. Wang et.al. | 2403.03956v1 | link |
2024-03-06 | 3D Diffusion Policy | Yanjie Ze et.al. | 2403.03954v1 | link |
2024-03-06 | Bridging Language and Items for Retrieval and Recommendation | Yupeng Hou et.al. | 2403.03952v1 | link |
2024-03-06 | Can Audio Reveal Music Performance Difficulty? Insights from the Piano Syllabus Dataset | Pedro Ramoneda et.al. | 2403.03947v1 | null |
2024-03-06 | Separate and Detailed Treatment of Absolute Signal and Noise Enables NMR Under Adverse Circumstances | A Guinness et.al. | 2403.03943v1 | null |
2024-03-06 | The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models | Adithya Bhaskar et.al. | 2403.03942v1 | link |
2024-03-06 | GUIDE: Guidance-based Incremental Learning with Diffusion Models | Bartosz Cywiński et.al. | 2403.03938v1 | link |
2024-03-05 | LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits | Masahiro Kato et.al. | 2403.03219v1 | null |
2024-03-05 | The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | Nathaniel Li et.al. | 2403.03218v1 | null |
2024-03-05 | Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion | Meng Zheng et.al. | 2403.03217v1 | null |
2024-03-05 | A Safety-Critical Framework for UGVs in Complex Environments: A Data-Driven Discrepancy-Aware Approach | Skylar X. Wei et.al. | 2403.03215v1 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206v1 | null |
2024-03-05 | CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments | Savitha Sam Abraham et.al. | 2403.03203v1 | null |
2024-03-03 | Bandit Profit-maximization for Targeted Marketing | Joon Suk Huh et.al. | 2403.01361v1 | null |
2024-03-03 | ModelWriter: Text & Model-Synchronized Document Engineering Platform | Ferhat Erata et.al. | 2403.01359v1 | null |
2024-03-03 | Improving Uncertainty Sampling with Bell Curve Weight Function | Zan-Kai Chong et.al. | 2403.01352v1 | null |
2024-03-03 | Efficient FIR filtering with Bit Layer Multiply Accumulator | Vincenzo Liguori et.al. | 2403.01351v1 | null |
2024-03-02 | ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation | Siyuan Bian et.al. | 2403.01345v1 | null |
2024-02-29 | DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Muyang Li et.al. | 2402.19481v1 | link |
2024-02-29 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Tsai-Shien Chen et.al. | 2402.19479v1 | null |
2024-02-29 | Learning a Generalized Physical Face Model From Data | Lingchen Yang et.al. | 2402.19477v1 | null |
2024-02-29 | The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? | Alex Gu et.al. | 2402.19475v1 | null |
2024-02-29 | The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Weiyun Wang et.al. | 2402.19474v1 | link |
2024-02-29 | Retrieval-Augmented Generation for AI-Generated Content: A Survey | Penghao Zhao et.al. | 2402.19473v1 | link |
2024-02-29 | Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling | Gabriel Grand et.al. | 2402.19471v1 | null |
2024-02-29 | Humanoid Locomotion as Next Token Prediction | Ilija Radosavovic et.al. | 2402.19469v1 | null |
2024-02-28 | UniMODE: Unified Monocular 3D Object Detection | Zhuoling Li et.al. | 2402.18573v1 | null |
2024-02-28 | Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards | Haoxiang Wang et.al. | 2402.18571v1 | link |
2024-02-28 | Diffusion Language Models Are Versatile Protein Learners | Xinyou Wang et.al. | 2402.18567v1 | null |
2024-02-28 | Approaching Human-Level Forecasting with Language Models | Danny Halawi et.al. | 2402.18563v1 | null |
2024-02-27 | The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Shuming Ma et.al. | 2402.17764v1 | null |
2024-02-27 | Reducing Unnecessary Alerts in Pedestrian Protection Systems Based on P2V Communications | Ignacio Soto et.al. | 2402.17763v1 | null |
2024-02-27 | Towards Optimal Learning of Language Models | Yuxian Gu et.al. | 2402.17759v1 | null |
2024-02-27 | ADL4D: Towards A Contextually Rich Dataset for 4D Activities of Daily Living | Marsil Zakour et.al. | 2402.17758v1 | null |
2024-02-27 | Evaluating Very Long-Term Conversational Memory of LLM Agents | Adyasha Maharana et.al. | 2402.17753v1 | null |
2024-02-26 | Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision | Fan Jiang et.al. | 2402.16508v1 | link |
2024-02-26 | Stochastic Conditional Diffusion Models for Semantic Image Synthesis | Juyeon Ko et.al. | 2402.16506v1 | null |
2024-02-26 | SAND: Decoupling Sanitization from Fuzzing for Low Overhead | Ziqiao Kong et.al. | 2402.16497v1 | null |
2024-02-26 | Intelligent Known and Novel Aircraft Recognition – A Shift from Classification to Similarity Learning for Combat Identification | Ahmad Saeed et.al. | 2402.16486v1 | null |
2024-02-23 | Seamless Human Motion Composition with Blended Positional Encodings | German Barquero et.al. | 2402.15509v1 | link |
2024-02-23 | AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning | Jianguo Zhang et.al. | 2402.15506v1 | link |
2024-02-23 | Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts | Yuejiang Liu et.al. | 2402.15505v1 | null |
2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh et.al. | 2402.15504v1 | link |
2024-02-23 | API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Kinjal Basu et.al. | 2402.15491v1 | null |
2024-02-22 | PALO: A Polyglot Large Multimodal Model for 5B People | Muhammad Maaz et.al. | 2402.14818v1 | link |
2024-02-22 | Cameras as Rays: Pose Estimation via Ray Diffusion | Jason Y. Zhang et.al. | 2402.14817v1 | null |
2024-02-22 | WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition | Lianghui Zhu et.al. | 2402.14812v1 | link |
2024-02-22 | Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking | Nikhil Prakash et.al. | 2402.14811v1 | null |
2024-02-22 | GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion | Xueyi Liu et.al. | 2402.14810v1 | link |
2024-02-22 | CriticBench: Benchmarking LLMs for Critique-Correct Reasoning | Zicheng Lin et.al. | 2402.14809v1 | link |
2024-02-22 | RelayAttention for Efficient Large Language Model Serving with Long System Prompts | Lei Zhu et.al. | 2402.14808v1 | link |
2024-02-22 | A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health | Nikhil Behari et.al. | 2402.14807v1 | null |
2024-02-22 | Identifying Multiple Personalities in Large Language Models with External Evaluation | Xiaoyang Song et.al. | 2402.14805v1 | null |
2024-02-21 | D-Flow: Differentiating through Flows for Controlled Generation | Heli Ben-Hamu et.al. | 2402.14017v1 | null |
2024-02-21 | Corrective Machine Unlearning | Shashwat Goel et.al. | 2402.14015v1 | link |
2024-02-21 | Geometry-Informed Neural Networks | Arturs Berzins et.al. | 2402.14009v1 | null |
2024-02-21 | OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems | Chaoqun He et.al. | 2402.14008v1 | link |
2024-02-21 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models | Aline Ioste et.al. | 2402.14002v1 | null |
2024-02-21 | Real-time 3D-aware Portrait Editing from a Single Image | Qingyan Bai et.al. | 2402.14000v1 | null |
2024-02-20 | CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples | Jianrui Zhang et.al. | 2402.13254v1 | link |
2024-02-20 | BiMediX: Bilingual Medical Mixture of Experts LLM | Sara Pieri et.al. | 2402.13253v1 | link |
2024-02-20 | Video ReCap: Recursive Captioning of Hour-Long Videos | Md Mohaiminul Islam et.al. | 2402.13250v1 | null |
2024-02-20 | TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization | Liyan Tang et.al. | 2402.13249v1 | link |
2024-02-20 | Are Fact-Checking Tools Reliable? An Evaluation of Google Fact Check | Qiangeng Yang et.al. | 2402.13244v1 | null |
2024-02-20 | Unlocking Insights: Semantic Search in Jupyter Notebooks | Lan Li et.al. | 2402.13234v1 | null |
2024-02-20 | A Touch, Vision, and Language Dataset for Multimodal Alignment | Letian Fu et.al. | 2402.13232v1 | link |
2024-02-19 | FiT: Flexible Vision Transformer for Diffusion Model | Zeyu Lu et.al. | 2402.12376v1 | link |
2024-02-19 | A synthetic data approach for domain generalization of NLI models | Mohammad Javad Hosseini et.al. | 2402.12368v1 | null |
2024-02-19 | A Critical Evaluation of AI Feedback for Aligning Large Language Models | Archit Sharma et.al. | 2402.12366v1 | link |
2024-02-19 | Almost-linear time parameterized algorithm for rankwidth via dynamic rankwidth | Tuukka Korhonen et.al. | 2402.12364v1 | null |
2024-02-19 | Flip Graphs of Pseudo-Triangulations With Face Degree at Most 4 | Maarten Löffler et.al. | 2402.12357v1 | null |
2024-02-19 | Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge | Julien Delile et.al. | 2402.12352v1 | null |
2024-02-16 | Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning | Chia-Ling Tsai et.al. | 2402.10894v1 | null |
2024-02-16 | RLVF: Learning from Verbal Feedback without Overgeneralization | Moritz Stephan et.al. | 2402.10893v1 | link |
2024-02-16 | Instruction Diversity Drives Generalization To Unseen Tasks | Dylan Zhang et.al. | 2402.10891v1 | null |
2024-02-16 | When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Ziru Chen et.al. | 2402.10890v1 | link |
2024-02-16 | Evaluation of EAP Usage for Authenticating Eduroam Users in 5G Networks | Leonardo Azalim de Oliveira et.al. | 2402.10889v1 | null |
2024-02-16 | Explainability for Machine Learning Models: From Data Adaptability to User Perception | Julien Delaunay et.al. | 2402.10888v1 | null |
2024-02-16 | Reviewer2: Optimizing Review Generation Through Prompt Generation | Zhaolin Gao et.al. | 2402.10886v1 | null |
2024-02-16 | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Tsung-Wei Ke et.al. | 2402.10885v1 | null |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210v1 | null |
2024-02-15 | Recovering the Pre-Fine-Tuning Weights of Generative Models | Eliahu Horwitz et.al. | 2402.10208v1 | link |
2024-02-15 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207v1 | link |
2024-02-15 | Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention | Romain Ilbert et.al. | 2402.10198v1 | link |
2024-02-15 | BitDelta: Your Fine-Tune May Only Be Worth One Bit | James Liu et.al. | 2402.10193v1 | link |
2024-02-15 | Multi-Excitation Projective Simulation with a Many-Body Physics Inspired Inductive Bias | Philip A. LeMaitre et.al. | 2402.10192v1 | link |
2024-02-15 | FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients | Xinchi Qiu et.al. | 2402.10191v1 | null |
2024-02-14 | AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability | Siwei Yang et.al. | 2402.09404v1 | link |
2024-02-14 | Reinforcement Learning from Human Feedback with Active Queries | Kaixuan Ji et.al. | 2402.09401v1 | null |
2024-02-14 | Long-form evaluation of model editing | Domenic Rosati et.al. | 2402.09394v1 | null |
2024-02-14 | Introduction to Physically Unclonable Functions: Properties and Applications | M. Garcia-Bosque et.al. | 2402.09386v1 | null |
2024-02-14 | GraSSRep: Graph-Based Self-Supervised Learning for Repeat Detection in Metagenomic Assembly | Ali Azizpour et.al. | 2402.09381v1 | link |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682v1 | null |
2024-02-13 | Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance | Linxi Zhao et.al. | 2402.08680v1 | null |
2024-02-13 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Xingang Guo et.al. | 2402.08679v1 | link |
2024-02-13 | Graph Mamba: Towards Learning on Graphs with State Space Models | Ali Behrouz et.al. | 2402.08678v1 | link |
2024-02-13 | Model Assessment and Selection under Temporal Distribution Shift | Elise Han et.al. | 2402.08672v1 | link |
2024-02-13 | Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models | Yuqing Liu et.al. | 2402.08670v1 | null |
2024-02-13 | Improving Generalization in Semantic Parsing by Increasing Natural Language Variation | Irina Saparina et.al. | 2402.08666v1 | link |
2024-02-12 | A systematic investigation of learnability from single child linguistic input | Yulu Qin et.al. | 2402.07899v1 | null |
2024-02-12 | Label-Efficient Model Selection for Text Generation | Shir Ashury-Tahan et.al. | 2402.07891v1 | null |
2024-02-12 | Toward an Android Static Analysis Approach for Data Protection | Mugdha Khedkar et.al. | 2402.07889v1 | null |
2024-02-12 | WildfireGPT: Tailored Large Language Model for Wildfire Analysis | Yangxinyu Xie et.al. | 2402.07877v1 | null |
2024-02-12 | Policy Improvement using Language Feedback Models | Victor Zhong et.al. | 2402.07876v1 | null |
2024-02-09 | Feedback Loops With Language Models Drive In-Context Reward Hacking | Alexander Pan et.al. | 2402.06627v1 | link |
2024-02-09 | Understanding the Effects of Iterative Prompting on Truthfulness | Satyapriya Krishna et.al. | 2402.06625v1 | null |
2024-02-09 | A two-stage algorithm in evolutionary product unit neural networks for classification | Antonio J. Tallón-Ballesteros et.al. | 2402.06622v1 | null |
2024-02-09 | TIC: Translate-Infer-Compile for accurate ‘text to plan’ using LLMs and logical intermediate representations | Sudhir Agarwal et.al. | 2402.06608v1 | null |
2024-02-09 | On the Out-Of-Distribution Generalization of Multimodal Large Language Models | Xingxuan Zhang et.al. | 2402.06599v1 | null |
2024-02-09 | CigaR: Cost-efficient Program Repair with LLMs | Dávid Hidvégi et.al. | 2402.06598v1 | link |
2024-02-09 | Understanding the Weakness of Large Language Model Agents within a Complex Android Environment | Mingzhe Xing et.al. | 2402.06596v1 | link |
2024-02-08 | InstaGen: Enhancing Object Detection by Training on Synthetic Dataset | Chengjian Feng et.al. | 2402.05937v1 | null |
2024-02-08 | SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Peng Gao et.al. | 2402.05935v1 | link |
2024-02-08 | Time Series Diffusion in the Frequency Domain | Jonathan Crabbé et.al. | 2402.05933v1 | link |
2024-02-08 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | Xing Han Lù et.al. | 2402.05930v1 | link |
2024-02-08 | An Interactive Agent Foundation Model | Zane Durante et.al. | 2402.05929v1 | null |
2024-02-08 | Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss | Ingvar Ziemann et.al. | 2402.05928v1 | null |
2024-02-07 | Image captioning for Brazilian Portuguese using GRIT model | Rafael Silva de Alencar et.al. | 2402.05106v1 | null |
2024-02-07 | You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models | Alix Decrop et.al. | 2402.05102v1 | null |
2024-02-07 | Hydragen: High-Throughput LLM Inference with Shared Prefixes | Jordan Juravsky et.al. | 2402.05099v1 | null |
2024-02-07 | On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling | Marcin Sendera et.al. | 2402.05098v1 | link |
2024-02-07 | Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation | Dennis Hoftijzer et.al. | 2402.05090v1 | null |
2024-02-07 | Hyperspectral acquisition with ScanImage at the single pixel level: Application to time domain coherent Raman imaging | Samuel Metais et.al. | 2402.05086v1 | null |
2024-02-06 | Linear-time Minimum Bayes Risk Decoding with Reference Aggregation | Jannis Vamvas et.al. | 2402.04251v1 | link |
2024-02-06 | CAST: Clustering Self-Attention using Surrogate Tokens for Efficient Transformers | Adjorn van Engelenhoven et.al. | 2402.04239v1 | null |
2024-02-06 | CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | Ji Qi et.al. | 2402.04236v1 | link |
2024-02-06 | Role of spontaneously generated coherence (SGC) in laser cooling of atoms | Rajnandan Choudhury Das et.al. | 2402.04234v1 | null |
2024-02-06 | Can Generative Agents Predict Emotion? | Ciaran Regan et.al. | 2402.04232v1 | null |
2024-02-06 | Further Constructions of AMUBs for Non-prime power Composite Dimensions | Ajeet Kumar et.al. | 2402.04231v1 | null |
2024-02-05 | Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | Qiyao Liang et.al. | 2402.03305v1 | null |
2024-02-05 | GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models | Haibo Jin et.al. | 2402.03299v1 | null |
2024-02-05 | Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks | Yongchang Hao et.al. | 2402.03295v1 | null |
2024-02-05 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang et.al. | 2402.03290v1 | link |
2024-02-05 | Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS | Matthew DeLorenzo et.al. | 2402.03289v1 | null |
2024-02-05 | A Lennard-Jones Layer for Distribution Normalization | Mulun Na et.al. | 2402.03287v1 | null |
2024-02-05 | Training-Free Consistent Text-to-Image Generation | Yoad Tewel et.al. | 2402.03286v1 | null |
2024-02-05 | Towards a Flexible Scale-out Framework for Efficient Visual Data Query Processing | Rohit Verma et.al. | 2402.03283v1 | null |
2024-02-02 | Position Paper: Generalized grammar rules and structure-based generalization beyond classical equivariance for lexical tasks and transduction | Mircea Petrache et.al. | 2402.01629v1 | null |
2024-02-02 | Stochastic Two Points Method for Deep Model Zeroth-order Optimization | Yijiang Pang et.al. | 2402.01621v1 | null |
2024-02-02 | MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Justin Chih-Yao Chen et.al. | 2402.01620v1 | link |
2024-02-02 | Style Vectors for Steering Generative Large Language Model | Kai Konen et.al. | 2402.01618v1 | link |
2024-02-02 | A GP-based Robust Motion Planning Framework for Agile Autonomous Robot Navigation and Recovery in Unknown Environments | Nicholas Mohammad et.al. | 2402.01617v1 | null |
2024-02-01 | AToM: Amortized Text-to-Mesh using 2D Diffusion | Guocheng Qian et.al. | 2402.00867v1 | null |
2024-02-01 | Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection | Qinyu Zhao et.al. | 2402.00865v1 | link |
2024-02-01 | Evaluating Large Language Models for Generalization and Robustness via Data Compression | Yucheng Li et.al. | 2402.00861v1 | link |
2024-02-01 | Can Large Language Models Understand Context? | Yilun Zhu et.al. | 2402.00858v1 | null |
2024-02-01 | SymbolicAI: A framework for logic-based approaches combining generative models and solvers | Marius-Constantin Dinu et.al. | 2402.00854v1 | link |
2024-02-01 | LTAU-FF: Loss Trajectory Analysis for Uncertainty in Atomistic Force Fields | Joshua A. Vita et.al. | 2402.00853v1 | null |
2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | Daniel Geng et.al. | 2401.18085v1 | null |
2024-01-31 | Improved Scene Landmark Detection for Camera Localization | Tien Do et.al. | 2401.18083v1 | link |
2024-01-31 | Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? | Andreas Opedal et.al. | 2401.18070v1 | null |
2024-01-30 | A simple, strong baseline for building damage detection on the xBD dataset | Sebastian Gerard et.al. | 2401.17271v1 | link |
2024-01-30 | Weaver: Foundation Models for Creative Writing | Tiannan Wang et.al. | 2401.17268v1 | null |
2024-01-30 | Proactive Detection of Voice Cloning with Localized Watermarking | Robin San Roman et.al. | 2401.17264v1 | link |
2024-01-30 | Weak-to-Strong Jailbreaking on Large Language Models | Xuandong Zhao et.al. | 2401.17256v1 | link |
2024-01-29 | Endo-4DGS: Distilling Depth Ranking for Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting | Yiming Huang et.al. | 2401.16416v1 | null |
2024-01-29 | A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect | Yunkang Cao et.al. | 2401.16402v1 | null |
2024-01-29 | Amazon’s 2023 Drought: Sentinel-1 Reveals Extreme Rio Negro River Contraction | Fabien H Wagner et.al. | 2401.16393v1 | null |
2024-01-26 | EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Yuhui Li et.al. | 2401.15077v1 | link |
2024-01-26 | Annotated Hands for Generative Models | Yue Yang et.al. | 2401.15075v1 | link |
2024-01-26 | From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities | Chaochao Lu et.al. | 2401.15071v1 | null |
2024-01-26 | Pairing Orthographically Variant Literary Words to Standard Equivalents Using Neural Edit Distance Models | Craig Messner et.al. | 2401.15068v1 | null |
2024-01-26 | Asymmetric Influence of the Amplitude-Dependent Tune Shift on the Transverse Mode-Coupling Instability | Miriam Brosi et.al. | 2401.15065v1 | null |
2024-01-26 | Expert with Clustering: Hierarchical Online Preference Learning Framework | Tianyue Zhou et.al. | 2401.15062v1 | null |
2024-01-25 | Deconstructing Denoising Diffusion Models for Self-Supervised Learning | Xinlei Chen et.al. | 2401.14404v1 | null |
2024-01-25 | O(1) Insertion for Random Walk d-ary Cuckoo Hashing up to the Load Threshold | Tolson Bell et.al. | 2401.14394v1 | null |
2024-01-25 | Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs | Michael R. H. Vorndran et.al. | 2401.14387v1 | link |
2024-01-25 | Manifold GCN: Diffusion-based Convolutional Neural Network for Manifold-valued Graphs | Martin Hanik et.al. | 2401.14381v1 | null |
2024-01-25 | UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models | Timo Kapsalis et.al. | 2401.14379v1 | null |
2024-01-24 | Graph-Informed Neural Networks for Sparse Grid-Based Discontinuity Detectors | Francesco Della Santa et.al. | 2401.13652v1 | link |
2024-01-24 | Employing polyhedral methods to optimize stencils on FPGAs with stencil-specific caches, data reuse, and wide data bursts | Florian Mayer et.al. | 2401.13645v1 | null |
2024-01-24 | Unveiling homophily beyond the pool of opportunities | Sina Sajjadi et.al. | 2401.13642v1 | null |
2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | Taeksoo Kim et.al. | 2401.12979v1 | null |
2024-01-23 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | Hyeonwoo Kim et.al. | 2401.12978v1 | null |
2024-01-23 | In-Context Language Learning: Architectures and Algorithms | Ekin Akyürek et.al. | 2401.12973v1 | link |
2024-01-23 | Raidar: geneRative AI Detection viA Rewriting | Chengzhi Mao et.al. | 2401.12970v1 | link |
2024-01-23 | Minimizing the Age of Two Heterogeneous Sources With Packet Drops Via Cyclic Schedulers | Sahan Liyanaarachchi et.al. | 2401.12962v1 | null |
2024-01-23 | Chatterbox: Robust Transport for LLM Token Streaming under Unstable Network | Hanchen Li et.al. | 2401.12961v1 | null |
2024-01-22 | Exploring Simple Open-Vocabulary Semantic Segmentation | Zihang Lai et.al. | 2401.12217v1 | link |
2024-01-22 | Genericity Through Stratification | Victor Arrial et.al. | 2401.12212v1 | null |
2024-01-22 | OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics | Peiqi Liu et.al. | 2401.12202v1 | link |
2024-01-22 | APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference | Bowen Zhao et.al. | 2401.12200v1 | null |
2024-01-22 | Learning Dynamics from Multicellular Graphs with Deep Neural Networks | Haiqian Yang et.al. | 2401.12196v1 | null |
2024-01-22 | Text Embedding Inversion Attacks on Multilingual Language Models | Yiyi Chen et.al. | 2401.12192v1 | null |
2024-01-19 | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | Lihe Yang et.al. | 2401.10891v1 | link |
2024-01-19 | Event detection from novel data sources: Leveraging satellite imagery alongside GPS traces | Ekin Ugurel et.al. | 2401.10890v1 | link |
2024-01-19 | Synthesizing Moving People with 3D Control | Boyi Li et.al. | 2401.10889v1 | null |
2024-01-19 | Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning | Adib Hasan et.al. | 2401.10862v1 | link |
2024-01-18 | ParaHome: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions | Jeonghwan Kim et.al. | 2401.10232v1 | null |
2024-01-18 | Simultaneous Tactile Estimation and Control for Extrinsic Dexterity | Antonia Bronars et.al. | 2401.10230v1 | null |
2024-01-18 | RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Shilin Xu et.al. | 2401.10228v1 | link |
2024-01-18 | A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Wouter Van Gansbeke et.al. | 2401.10227v1 | link |
2024-01-18 | The Manga Whisperer: Automatically Generating Transcriptions for Comics | Ragav Sachdeva et.al. | 2401.10224v1 | link |
2024-01-18 | Supervised Fine-tuning in turn Improves Visual Foundation Models | Xiaohu Jiang et.al. | 2401.10222v1 | link |
2024-01-18 | AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data | Caroline Choi et.al. | 2401.10220v1 | null |
2024-01-18 | Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions | Namitha Padmanabhan et.al. | 2401.10217v1 | null |
2024-01-18 | GPAvatar: Generalizable and Precise Head Avatar from Image(s) | Xuangeng Chu et.al. | 2401.10215v1 | link |
2024-01-17 | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Lianghui Zhu et.al. | 2401.09417v1 | link |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414v1 | link |
2024-01-17 | Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text | Mazal Bethany et.al. | 2401.09407v1 | null |
2024-01-16 | Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions | Nooshin Pourkamali et.al. | 2401.08429v1 | null |
2024-01-16 | Three ways that non-differentiability affects neural network training | Siddharth Krishna Kumar et.al. | 2401.08426v1 | null |
2024-01-16 | U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts | Silvia Zottin et.al. | 2401.08425v1 | null |
2024-01-16 | Ask the experts: sourcing high-quality datasets for nutritional counselling through Human-AI collaboration | Simone Balloccu et.al. | 2401.08420v1 | link |
2024-01-16 | Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation | Haoran Xu et.al. | 2401.08417v1 | link |
2024-01-12 | Automated Test Case Repair Using Language Models | Ahmadreza Saboor Yaraghi et.al. | 2401.06765v1 | null |
2024-01-12 | APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding | Mingdao Liu et.al. | 2401.06761v1 | null |
2024-01-12 | Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Muhammad Naveed Riaz et.al. | 2401.06757v1 | null |
2024-01-12 | Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection | Muhammad Tayyab Zamir et.al. | 2401.06752v1 | null |
2024-01-12 | The Unreasonable Effectiveness of Easy Training Data for Hard Tasks | Peter Hase et.al. | 2401.06751v1 | link |
2024-01-12 | Measure Theoretic Reeb Graphs and Reeb Spaces | Qingsong Wang et.al. | 2401.06748v1 | null |
2024-01-11 | Distilling Vision-Language Models on Millions of Videos | Yue Zhao et.al. | 2401.06129v1 | null |
2024-01-11 | E $^{2}$ GAN: Efficient Training of Efficient GANs for Image-to-Image Translation | Yifan Gong et.al. | 2401.06127v1 | null |
2024-01-11 | Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors | Jack Saunders et.al. | 2401.06126v1 | null |
2024-01-11 | Manipulating Feature Visualizations with Gradient Slingshots | Dilyara Bareeva et.al. | 2401.06122v1 | link |
2024-01-11 | Gaussian Shadow Casting for Neural Characters | Luis Bolanos et.al. | 2401.06116v1 | null |
2024-01-11 | Jupyter widgets and extensions for education and research in computational physics and chemistry | Dou Du et.al. | 2401.06113v1 | null |
2024-01-10 | InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes | Mohamad Shahbazi et.al. | 2401.05335v1 | null |
2024-01-10 | URHand: Universal Relightable Hands | Zhaoxi Chen et.al. | 2401.05334v1 | null |
2024-01-10 | SmartMME: Implementation of Base Station Switching Off Strategy in ns-3 | Argha Sen et.al. | 2401.05329v1 | null |
2024-01-10 | Leveraging Print Debugging to Improve Code Generation in Large Language Models | Xueyu Hu et.al. | 2401.05319v1 | null |
2024-01-10 | Can Probabilistic Feedback Drive User Impacts in Online Platforms? | Jessica Dai et.al. | 2401.05304v1 | null |
2024-01-09 | Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation | Xiyi Chen et.al. | 2401.04728v1 | null |
2024-01-09 | Low-Resource Vision Challenges for Foundation Models | Yunhua Zhang et.al. | 2401.04716v1 | null |
2024-01-09 | Bin Packing under Random-Order: Breaking the Barrier of 3/2 | Anish Hebbar et.al. | 2401.04714v1 | link |
2024-01-09 | RNA-TransCrypt: Image Encryption Using Chaotic RNA Encoding, Novel Transformative Substitution, and Tailored Cryptographic Operations | Muhammad Shahbaz Khan et.al. | 2401.04707v1 | null |
2024-01-08 | AGG: Amortized Generative 3D Gaussians for Single Image to 3D | Dejia Xu et.al. | 2401.04099v1 | null |
2024-01-08 | Modeling AoII in Push- and Pull-Based Sampling of Continuous Time Markov Chains | Ismail Cosandal et.al. | 2401.04098v1 | null |
2024-01-08 | GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation | Tong Wu et.al. | 2401.04092v1 | link |
2024-01-08 | Mixtral of Experts | Albert Q. Jiang et.al. | 2401.04088v1 | null |
2024-01-05 | Denoising Vision Transformers | Jiawei Yang et.al. | 2401.02957v1 | link |
2024-01-05 | Locally Adaptive Neural 3D Morphable Models | Michail Tarasiou et.al. | 2401.02937v1 | link |
2024-01-05 | Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks | Kevin Everson et.al. | 2401.02921v1 | null |
2024-01-04 | Learning to Prompt with Text Only Supervision for Vision-Language Models | Muhammad Uzair Khattak et.al. | 2401.02418v1 | link |
2024-01-04 | LLaMA Pro: Progressive LLaMA with Block Expansion | Chengyue Wu et.al. | 2401.02415v1 | link |
2024-01-04 | LLM Augmented LLMs: Expanding Capabilities through Composition | Rachit Bansal et.al. | 2401.02412v1 | null |
2024-01-04 | What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs | Alex Trevithick et.al. | 2401.02411v1 | null |
2024-01-04 | Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks | Hartwig H. Hochmair et.al. | 2401.02404v1 | null |
2024-01-04 | 3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation | Zihao Xiao et.al. | 2401.02402v1 | null |
2024-01-04 | Learning the 3D Fauna of the Web | Zizhang Li et.al. | 2401.02400v1 | null |
2024-01-03 | From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Evonne Ng et.al. | 2401.01885v1 | link |
2024-01-03 | A rewriting-logic-with-SMT-based formal analysis and parameter synthesis framework for parametric time Petri nets | Jaime Arias et.al. | 2401.01884v1 | null |
2024-01-03 | Theoretical guarantees on the best-of-n alignment policy | Ahmad Beirami et.al. | 2401.01879v1 | null |
2024-01-03 | Graph Neural Networks for Surfactant Multi-Property Prediction | Christoforos Brozos et.al. | 2401.01874v1 | link |
2024-01-03 | Dataset Difficulty and the Role of Inductive Bias | Devin Kwok et.al. | 2401.01867v1 | null |
2024-01-02 | Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models | Zixiang Chen et.al. | 2401.01335v1 | link |
2024-01-02 | An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction | Zaratiana Urchade et.al. | 2401.01326v1 | link |
2024-01-02 | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | S. M Towhidul Islam Tonmoy et.al. | 2401.01313v1 | null |
2024-01-02 | On the uniqueness and computation of commuting extensions | Pascal Koiran et.al. | 2401.01302v1 | null |
2023-12-29 | K-PERM: Personalized Response Generation Using Dynamic Knowledge Retrieval and Persona-Adaptive Queries | Kanak Raj et.al. | 2312.17748v1 | link |
2023-12-28 | Do Androids Know They’re Only Dreaming of Electric Sheep? | Sky CH-Wang et.al. | 2312.17249v1 | null |
2023-12-28 | Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity | Guhao Feng et.al. | 2312.17248v1 | null |
2023-12-28 | The LLM Surgeon | Tycho F. A. van der Ouderaa et.al. | 2312.17244v1 | link |
2023-12-28 | Unsupervised Universal Image Segmentation | Dantong Niu et.al. | 2312.17243v1 | link |
2023-12-28 | Learning to Generate Text in Arbitrary Writing Styles | Aleem Khan et.al. | 2312.17242v1 | null |
2023-12-28 | An Improved Baseline for Reasoning Segmentation with Large Language Model | Senqiao Yang et.al. | 2312.17240v1 | null |
2023-12-28 | Fast Inference of Mixture-of-Experts Language Models with Offloading | Artyom Eliseev et.al. | 2312.17238v1 | link |
2023-12-28 | A Simple LLM Framework for Long-Range Video Question-Answering | Ce Zhang et.al. | 2312.17235v1 | link |
2023-12-28 | Personalized Restoration via Dual-Pivot Tuning | Pradyumna Chari et.al. | 2312.17234v1 | null |
2023-12-26 | Social-Transmotion: Promptable Human Trajectory Prediction | Saeed Saadatnejad et.al. | 2312.16168v1 | link |
2023-12-26 | Age of Information in Gossip Networks: A Friendly Introduction and Literature Survey | Priyanka Kaswan et.al. | 2312.16163v1 | null |
2023-12-26 | Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages | Mofetoluwa Adeyemi et.al. | 2312.16159v1 | null |
2023-12-26 | From Text to Multimodal: A Comprehensive Survey of Adversarial Example Generation in Question Answering Systems | Gulsum Yigit et.al. | 2312.16156v1 | null |
2023-12-26 | Validating Light Phenomena Conceptual Assessment Through The Lens of CTT and IRT Frameworks | Purwoko Haryadi Santoso et.al. | 2312.16153v1 | null |
2023-12-26 | SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network | Yuhang He et.al. | 2312.16149v1 | null |
2023-12-22 | MACS: Mass Conditioned 3D Hand and Object Motion Synthesis | Soshi Shimada et.al. | 2312.14929v1 | null |
2023-12-22 | PoseGen: Learning to Generate 3D Human Pose Dataset with NeRF | Mohsen Gholami et.al. | 2312.14915v1 | link |
2023-12-21 | Virtual Pets: Animatable Animal Generation in 3D Scenes | Yen-Chi Cheng et.al. | 2312.14154v1 | null |
2023-12-21 | DriveLM: Driving with Graph Visual Question Answering | Chonghao Sima et.al. | 2312.14150v1 | link |
2023-12-21 | HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs | Artem Sevastopolsky et.al. | 2312.14140v1 | null |
2023-12-21 | Diffusion Reward: Learning Rewards via Conditional Video Diffusion | Tao Huang et.al. | 2312.14134v1 | null |
2023-12-20 | Generative Multimodal Models are In-Context Learners | Quan Sun et.al. | 2312.13286v1 | link |
2023-12-20 | UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections | Fangjinhua Wang et.al. | 2312.13285v1 | null |
2023-12-20 | Deep Learning on 3D Neural Fields | Pierluigi Zama Ramirez et.al. | 2312.13277v1 | null |
2023-12-20 | Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting | Junwu Zhang et.al. | 2312.13271v1 | link |
2023-12-19 | Weakly Supervised Open-Vocabulary Object Detection | Jianghang Lin et.al. | 2312.12437v1 | null |
2023-12-19 | A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise | Chaoyou Fu et.al. | 2312.12436v1 | link |
2023-12-19 | On Inference Stability for Diffusion Models | Viet Nguyen et.al. | 2312.12431v1 | link |
2023-12-19 | ROSE: A reduced-order scattering emulator for optical models | Daniel Odell et.al. | 2312.12426v1 | null |
2023-12-19 | SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process | Mengyu Wang et.al. | 2312.12425v1 | link |
2023-12-19 | Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model | Shraman Pramanick et.al. | 2312.12423v1 | null |
2023-12-19 | Scene-Conditional 3D Object Stylization and Composition | Jinghao Zhou et.al. | 2312.12419v1 | null |
2023-12-18 | On Computing Makespan-Optimal Solutions for Generalized Sliding-Tile Puzzles | Marcus Gozon et.al. | 2312.10887v1 | null |
2023-12-18 | A novel diffusion recommendation algorithm based on multi-scale CNN and residual LSTM | Yong Niu et.al. | 2312.10885v1 | null |
2023-12-18 | Sharable Clothoid-based Continuous Motion Planning for Connected Automated Vehicles | Sanghoon Oh et.al. | 2312.10880v1 | null |
2023-12-18 | Country-Scale Cropland Mapping in Data-Scarce Settings Using Deep Learning: A Case Study of Nigeria | Joaquin Gajardo et.al. | 2312.10872v1 | link |
2023-12-18 | From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape | Timothy R. McIntosh et.al. | 2312.10868v1 | null |
2023-12-15 | Osprey: Pixel Understanding with Visual Instruction Tuning | Yuqian Yuan et.al. | 2312.10032v1 | link |
2023-12-15 | Wearable Coaxially-shielded Metamaterial for Magnetic Resonance Imaging | Xia Zhu et.al. | 2312.10018v1 | null |
2023-12-15 | Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects | Paul Maria Scheikl et.al. | 2312.10008v1 | null |
2023-12-15 | Faithful Persona-based Conversational Dataset Generation with Large Language Models | Pegah Jandaghi et.al. | 2312.10007v1 | link |
2023-12-14 | LIME: Localized Image Editing via Attention Regularization in Diffusion Models | Enis Simsar et.al. | 2312.09256v1 | null |
2023-12-14 | Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization | Luca Bartolomei et.al. | 2312.09254v1 | link |
2023-12-14 | FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection | Hongsuk Choi et.al. | 2312.09252v1 | null |
2023-12-14 | VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation | Jinguo Zhu et.al. | 2312.09251v1 | link |
2023-12-14 | Single Mesh Diffusion Models with Field Latents for Texture Generation | Thomas W. Mitchel et.al. | 2312.09250v1 | null |
2023-12-14 | ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining | Ruoxi Shi et.al. | 2312.09249v1 | null |
2023-12-14 | Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking | Jacob Eisenstein et.al. | 2312.09244v1 | null |
2023-12-14 | OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields | Chubin Zhang et.al. | 2312.09243v1 | link |
2023-12-14 | Text2Immersion: Generative Immersive Scene with 3D Gaussians | Hao Ouyang et.al. | 2312.09242v1 | null |
2023-12-13 | SAM-guided Graph Cut for 3D Instance Segmentation | Haoyu Guo et.al. | 2312.08372v1 | null |
2023-12-13 | PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection | Kuan-Chih Huang et.al. | 2312.08371v1 | link |
2023-12-13 | An Invitation to Deep Reinforcement Learning | Bernhard Jaeger et.al. | 2312.08365v1 | null |
2023-12-13 | View-Dependent Octree-based Mesh Extraction in Unbounded Scenes for Procedural Synthetic Data | Zeyu Ma et.al. | 2312.08364v1 | link |
2023-12-13 | On the Computational Hardness of Quantum One-Wayness | Bruno Cavalar et.al. | 2312.08363v1 | null |
2023-12-13 | Distributed Inference and Fine-tuning of Large Language Models Over The Internet | Alexander Borzunov et.al. | 2312.08361v1 | null |
2023-12-12 | diff History for Long-Context Language Agents | Ulyana Piterbarg et.al. | 2312.07540v1 | link |
2023-12-12 | HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation | Hongyu Liu et.al. | 2312.07539v1 | null |
2023-12-12 | FreeInit: Bridging Initialization Gap in Video Diffusion Models | Tianxing Wu et.al. | 2312.07537v1 | link |
2023-12-12 | FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition | Sicheng Mo et.al. | 2312.07536v1 | null |
2023-12-12 | Interfacing Foundation Models’ Embeddings | Xueyan Zou et.al. | 2312.07532v1 | link |
2023-12-12 | Topological Obstructions and How to Avoid Them | Babak Esmaeili et.al. | 2312.07529v1 | null |
2023-12-11 | CAD: Photorealistic 3D Generation via Adversarial Distillation | Ziyu Wan et.al. | 2312.06663v1 | null |
2023-12-11 | Photorealistic Video Generation with Diffusion Models | Agrim Gupta et.al. | 2312.06662v1 | null |
2023-12-11 | UpFusion: Novel View Diffusion from Unposed Sparse View Observations | Bharath Raj Nagoor Kani et.al. | 2312.06661v1 | null |
2023-12-11 | EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM | Chong Zhou et.al. | 2312.06660v1 | link |
2023-12-11 | Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior | Fangfu Liu et.al. | 2312.06655v1 | link |
2023-12-11 | LightSim: Neural Lighting Simulation for Urban Scenes | Ava Pun et.al. | 2312.06654v1 | null |
2023-12-11 | Adaptive Human Trajectory Prediction via Latent Corridors | Neerja Thakkar et.al. | 2312.06653v1 | null |
2023-12-11 | Nuvo: Neural UV Mapping for Unruly 3D Representations | Pratul P. Srinivasan et.al. | 2312.05283v1 | null |
2023-12-08 | KBFormer: A Diffusion Model for Structured Entity Completion | Ouail Kitouni et.al. | 2312.05253v1 | null |
2023-12-08 | Laboratory realization of relativistic pair-plasma beams | C. D. Arrowsmith et.al. | 2312.05244v1 | null |
2023-12-08 | Contra generative AI detection in higher education assessments | Cesare G. Ardito et.al. | 2312.05241v1 | null |
2023-12-08 | SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation | Thuan Hoang Nguyen et.al. | 2312.05239v1 | null |
2023-12-08 | Seeing ChatGPT Through Universities’ Policies, Resources and Guidelines | Hui Wang et.al. | 2312.05235v1 | null |
2023-12-07 | Scaling Laws of Synthetic Images for Model Training … for Now | Lijie Fan et.al. | 2312.04567v1 | link |
2023-12-07 | Gen2Det: Generate to Detect | Saksham Suri et.al. | 2312.04566v1 | null |
2023-12-07 | MuRF: Multi-Baseline Radiance Fields | Haofei Xu et.al. | 2312.04565v1 | link |
2023-12-07 | GenDeF: Learning Generative Deformation Field for Video Generation | Wen Wang et.al. | 2312.04561v1 | null |
2023-12-07 | NeRFiller: Completing Scenes via Generative 3D Inpainting | Ethan Weber et.al. | 2312.04560v1 | null |
2023-12-07 | PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation | Zhaoxi Chen et.al. | 2312.04559v1 | link |
2023-12-07 | GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | Shoufa Chen et.al. | 2312.04557v1 | null |
2023-12-07 | Large Language Models for Mathematicians | Simon Frieder et.al. | 2312.04556v1 | null |
2023-12-07 | Improved Visual Grounding through Self-Consistent Explanations | Ruozhen He et.al. | 2312.04554v1 | null |
2023-12-07 | Generating Illustrated Instructions | Sachit Menon et.al. | 2312.04552v1 | null |
2023-12-06 | Relightable Gaussian Codec Avatars | Shunsuke Saito et.al. | 2312.03704v1 | null |
2023-12-06 | Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning | Xinshun Wang et.al. | 2312.03703v1 | link |
2023-12-06 | Self-conditioned Image Generation via Generating Representations | Tianhong Li et.al. | 2312.03701v1 | link |
2023-12-06 | Intrinsic Harmonization for Illumination-Aware Compositing | Chris Careaga et.al. | 2312.03698v1 | link |
2023-12-06 | Efficient Learning in Polyhedral Games via Best Response Oracles | Darshan Chakrabarti et.al. | 2312.03696v1 | null |
2023-12-06 | Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication | Ali Naseh et.al. | 2312.03692v1 | null |
2023-12-06 | On the Role of Edge Dependency in Graph Generative Models | Sudhanshu Chanpuriya et.al. | 2312.03691v1 | null |
2023-12-06 | Evaluating and Mitigating Discrimination in Language Model Decisions | Alex Tamkin et.al. | 2312.03689v1 | null |
2023-12-05 | GPT4Point: A Unified Framework for Point-Language Understanding and Generation | Zhangyang Qi et.al. | 2312.02980v1 | null |
2023-12-05 | Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World | Kiana Ehsani et.al. | 2312.02976v1 | null |
2023-12-05 | Describing Differences in Image Sets with Natural Language | Lisa Dunlap et.al. | 2312.02974v1 | link |
2023-12-05 | Alchemist: Parametric Control of Material Properties with Diffusion Models | Prafull Sharma et.al. | 2312.02970v1 | null |
2023-12-05 | Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models | Xinyu Zhang et.al. | 2312.02969v1 | null |
2023-12-05 | AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model | Boheng Zhao et.al. | 2312.02967v1 | null |
2023-12-05 | Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Cheng-Ju Ho et.al. | 2312.02966v1 | link |
2023-12-05 | MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures | Zhangyang Xiong et.al. | 2312.02963v1 | null |
2023-12-04 | Aligning and Prompting Everything All at Once for Universal Visual Perception | Yunhang Shen et.al. | 2312.02153v1 | link |
2023-12-04 | Readout Guidance: Learning Control from Diffusion Features | Grace Luo et.al. | 2312.02150v1 | null |
2023-12-04 | Generative Powers of Ten | Xiaojuan Wang et.al. | 2312.02149v1 | null |
2023-12-04 | Rejuvenating image-GPT as Strong Visual Representation Learners | Sucheng Ren et.al. | 2312.02147v1 | link |
2023-12-04 | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Bingxin Ke et.al. | 2312.02145v1 | link |
2023-12-04 | Optimizing Camera Configurations for Multi-View Pedestrian Detection | Yunzhong Hou et.al. | 2312.02144v1 | null |
2023-12-04 | Competition-Level Problems Are Effective Evaluators of LLMs | Yiming Huang et.al. | 2312.02143v1 | null |
2023-12-04 | Object Recognition as Next Token Prediction | Kaiyu Yue et.al. | 2312.02142v1 | link |
2023-12-01 | VideoBooth: Diffusion-based Video Generation with Image Prompts | Yuming Jiang et.al. | 2312.00777v1 | null |
2023-12-01 | Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans | Homanga Bharadhwaj et.al. | 2312.00775v1 | null |
2023-12-01 | Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses | Xiao Ma et.al. | 2312.00763v1 | null |
2023-12-01 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Albert Gu et.al. | 2312.00752v1 | link |
2023-12-01 | Reduction from sparse LPN to LPN, Dual Attack 3.0 | Kévin Carrier et.al. | 2312.00747v1 | null |
2023-12-01 | Adversarial Score Distillation: When score distillation meets GAN | Min Wei et.al. | 2312.00739v1 | link |
2023-11-30 | Dataset Distillation in Large Data Era | Zeyuan Yin et.al. | 2311.18838v1 | link |
2023-11-30 | VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Zhen Xing et.al. | 2311.18837v1 | null |
2023-11-30 | PoseGPT: Chatting about 3D Human Pose | Yao Feng et.al. | 2311.18836v1 | null |
2023-11-30 | InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation | Rongyao Fang et.al. | 2311.18835v1 | link |
2023-11-30 | ART $\boldsymbol{\cdot}$ V: Auto-Regressive Text-to-Video Generation with Diffusion Models | Wenming Weng et.al. | 2311.18834v1 | null |
2023-11-30 | Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction | Hsin-Ying Lee et.al. | 2311.18832v1 | link |
2023-11-30 | MotionEditor: Editing Video Motion via Content-Aware Diffusion | Shuyuan Tu et.al. | 2311.18830v1 | link |
2023-11-30 | MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Yanhui Wang et.al. | 2311.18829v1 | null |
2023-11-30 | One-step Diffusion with Distribution Matching Distillation | Tianwei Yin et.al. | 2311.18828v1 | null |
2023-11-30 | An Adaptive Framework for Generalizing Network Traffic Prediction towards Uncertain Environments | Alexander Downey et.al. | 2311.18824v1 | null |
2023-11-29 | A Simple Recipe for Language-guided Domain Generalized Segmentation | Mohammad Fahes et.al. | 2311.17922v1 | null |
2023-11-29 | Do text-free diffusion models learn discriminative visual representations? | Soumik Mukhopadhyay et.al. | 2311.17921v1 | link |
2023-11-29 | Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models | Daniel Geng et.al. | 2311.17919v1 | null |
2023-11-29 | Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving | Yuqi Wang et.al. | 2311.17918v1 | link |
2023-11-29 | AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text | Jianfeng Zhang et.al. | 2311.17917v1 | null |
2023-11-29 | OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation | Qidong Huang et.al. | 2311.17911v1 | link |
2023-11-29 | HUGS: Human Gaussian Splats | Muhammed Kocabas et.al. | 2311.17910v1 | null |
2023-11-29 | CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting | Alexander Vilesov et.al. | 2311.17907v1 | null |
2023-11-28 | HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting | Xian Liu et.al. | 2311.17061v1 | null |
2023-11-28 | Material Palette: Extraction of Materials from a Single Image | Ivan Lopes et.al. | 2311.17060v1 | null |
2023-11-28 | Panoptic Video Scene Graph Generation | Jingkang Yang et.al. | 2311.17058v1 | link |
2023-11-28 | ReMoS: Reactive 3D Motion Synthesis for Two-Person Interactions | Anindita Ghosh et.al. | 2311.17057v1 | null |
2023-11-28 | Self-Supervised Motion Magnification by Backpropagating Through Optical Flow | Zhaoying Pan et.al. | 2311.17056v1 | null |
2023-11-28 | No Representation Rules Them All in Category Discovery | Sagar Vaze et.al. | 2311.17055v1 | null |
2023-11-28 | DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models | Tsun-Hsuan Wang et.al. | 2311.17053v1 | null |
2023-11-28 | Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models | Zhengming Yu et.al. | 2311.17050v1 | null |
2023-11-27 | Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models | Munan Ning et.al. | 2311.16103v1 | link |
2023-11-27 | Test-time Adaptation of Discriminative Models via Diffusion Generative Feedback | Mihir Prabhudesai et.al. | 2311.16102v1 | null |
2023-11-27 | How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs | Haoqin Tu et.al. | 2311.16101v1 | link |
2023-11-27 | GART: Gaussian Articulated Template Models | Jiahui Lei et.al. | 2311.16099v1 | null |
2023-11-27 | On Bringing Robots Home | Nur Muhammad Mahi Shafiullah et.al. | 2311.16098v1 | link |
2023-11-27 | CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | Christian Diller et.al. | 2311.16097v1 | null |
2023-11-27 | Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling | Zhe Li et.al. | 2311.16096v1 | link |
2023-11-27 | Self-correcting LLM-controlled Diffusion Models | Tsung-Han Wu et.al. | 2311.16090v1 | null |
2023-11-27 | DUnE: Dataset for Unified Editing | Afra Feyza Akyürek et.al. | 2311.16087v1 | link |
2023-11-24 | SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation | Lingchen Meng et.al. | 2311.14671v1 | link |
2023-11-24 | Data-driven Prior Learning for Bayesian Optimisation | Sigrid Passano Hellan et.al. | 2311.14653v1 | link |
2023-11-24 | One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space | Raghav Addanki et.al. | 2311.14652v1 | null |
2023-11-24 | History Filtering in Imperfect Information Games: Algorithms and Complexity | Christopher Solinas et.al. | 2311.14651v1 | null |
2023-11-22 | Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation | Daichi Horita et.al. | 2311.13602v1 | null |
2023-11-22 | Visual In-Context Prompting | Feng Li et.al. | 2311.13601v1 | link |
2023-11-22 | ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs | Viraj Shah et.al. | 2311.13600v1 | null |
2023-11-22 | Risk-sensitive Markov Decision Process and Learning under General Utility Functions | Zhengqi Wu et.al. | 2311.13589v1 | null |
2023-11-22 | A Survey of Serverless Machine Learning Model Inference | Kamil Kojs et.al. | 2311.13587v1 | null |
2023-11-22 | On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates | Stefano Bruno et.al. | 2311.13584v1 | null |
2023-11-22 | PaSS: Parallel Speculative Sampling | Giovanni Monea et.al. | 2311.13581v1 | null |
2023-11-22 | Aufbau Suppressed Coupled Cluster Theory for Electronically Excited States | Harrison Tuckman et.al. | 2311.13576v1 | null |
2023-11-21 | Intrinsic Image Decomposition via Ordinal Shading | Chris Careaga et.al. | 2311.12792v1 | link |
2023-11-21 | Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks | Samyak Jain et.al. | 2311.12786v1 | null |
2023-11-20 | Rate-Independent Gradient Crystal Plasticity Theory – Robust Algorithmic Formulations based on Incremental Energy Minimization | Volker Fohrmeister et.al. | 2311.12026v1 | null |
2023-11-20 | The allosteric lever: towards a principle of specific allosteric response | Maximilian Vossel et.al. | 2311.12025v1 | null |
2023-11-20 | PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction | Peng Wang et.al. | 2311.12024v1 | null |
2023-11-20 | Macroscopic description of a heavy particle immersed within a flow of light particles | Radek Erban et.al. | 2311.12021v1 | null |
2023-11-20 | An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software | Aaditya Bhatia et.al. | 2311.12019v1 | null |
2023-11-20 | GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration | Naoki Wake et.al. | 2311.12015v1 | null |
2023-11-17 | Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning | Rohit Girdhar et.al. | 2311.10709v1 | null |
2023-11-17 | SelfEval: Leveraging the discriminative nature of generative models for evaluation | Sai Saketh Rambhatla et.al. | 2311.10708v1 | null |
2023-11-17 | Cactus Representations in Polylogarithmic Max-flow via Maximal Isolating Mincuts | Zhongtian He et.al. | 2311.10706v1 | null |
2023-11-16 | The Chosen One: Consistent Characters in Text-to-Image Diffusion Models | Omri Avrahami et.al. | 2311.10093v1 | null |
2023-11-16 | Traffic Video Object Detection using Motion Prior | Lihao Liu et.al. | 2311.10092v1 | null |
2023-11-16 | Adaptive Shells for Efficient Neural Radiance Field Rendering | Zian Wang et.al. | 2311.10091v1 | null |
2023-11-16 | Emu Edit: Precise Image Editing via Recognition and Generation Tasks | Shelly Sheynin et.al. | 2311.10089v1 | null |
2023-11-16 | DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback | Yangyi Chen et.al. | 2311.10081v1 | null |
2023-11-16 | Improving 3D Synthetic Jet Modeling in a Crossflow | Howard Ho et.al. | 2311.10072v1 | null |
2023-11-15 | Single-Image 3D Human Digitization with Shape-Guided Diffusion | Badour AlBahar et.al. | 2311.09221v1 | null |
2023-11-15 | DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model | Yinghao Xu et.al. | 2311.09217v1 | null |
2023-11-15 | Assessing Translation capabilities of Large Language Models involving English and Indian Languages | Vandan Mujadia et.al. | 2311.09216v1 | null |
2023-11-15 | GRIM: GRaph-based Interactive narrative visualization for gaMes | Jorge Leandro et.al. | 2311.09213v1 | null |
2023-11-15 | Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects – A Survey | Ashok Urlana et.al. | 2311.09212v1 | link |
2023-11-15 | Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models | Wenhao Yu et.al. | 2311.09210v1 | null |
2023-11-15 | A Unified Approach to Learning Ising Models: Beyond Independence and Bounded Width | Jason Gaitonde et.al. | 2311.09197v1 | null |
2023-11-15 | Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning without Task-Specific Knowledge | Sang-Hyun Lee et.al. | 2311.09195v1 | null |
2023-11-15 | Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models | James A. Michaelov et.al. | 2311.09194v1 | null |
2023-11-14 | Instant3D: Instant Text-to-3D Generation | Ming Li et.al. | 2311.08403v1 | null |
2023-11-14 | Fine-tuning Language Models for Factuality | Katherine Tian et.al. | 2311.08401v1 | null |
2023-11-14 | Towards Open-Ended Visual Recognition with Large Language Model | Qihang Yu et.al. | 2311.08400v1 | link |
2023-11-14 | Are Large Language Models Temporally Grounded? | Yifu Qiu et.al. | 2311.08398v1 | link |
2023-11-14 | MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation | Ehsan Asali et.al. | 2311.08393v1 | null |
2023-11-14 | On What Basis? Predicting Text Preference Via Structured Comparative Reasoning | Jing Nathan Yan et.al. | 2311.08390v1 | null |
2023-11-14 | TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer | Huashan Sun et.al. | 2311.08389v1 | null |
2023-11-13 | To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning | Junke Wang et.al. | 2311.07574v1 | link |
2023-11-13 | Realizability of Free Spaces of Curves | Hugo A. Akitaya et.al. | 2311.07573v1 | null |
2023-11-13 | Feature emergence via margin maximization: case studies in algebraic tasks | Depen Morwani et.al. | 2311.07568v1 | null |
2023-11-13 | GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation | An Yan et.al. | 2311.07562v1 | link |
2023-11-13 | Fast Normalized Cross-Correlation for Template Matching with Rotations | José María Almira et.al. | 2311.07561v1 | null |
2023-11-13 | Sound Gradual Verification with Symbolic Execution | Conrad Zimmerman et.al. | 2311.07559v1 | null |
2023-11-13 | Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning | Arjun Bhardwaj et.al. | 2311.07558v1 | null |
2023-11-10 | Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization | Weiyang Liu et.al. | 2311.06243v1 | null |
2023-11-10 | Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks | Bin Xiao et.al. | 2311.06242v1 | null |
2023-11-10 | Nonnegativity Problems for Matrix Semigroups | Julian D’Costa et.al. | 2311.06241v1 | null |
2023-11-10 | Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild | Nanna Inie et.al. | 2311.06237v1 | null |
2023-11-10 | Deep Learning meets Blockchain for Automated and Secure Access Control | Asma Jodeiri Akbarfam et.al. | 2311.06236v1 | null |
2023-11-10 | Learning material synthesis-structure-property relationship by data fusion: Bayesian Co-regionalization N-Dimensional Piecewise Function Learning | A. Gilad Kusne et.al. | 2311.06228v1 | null |
2023-11-10 | Does Differential Privacy Prevent Backdoor Attacks in Practice? | Fereshteh Razmi et.al. | 2311.06227v1 | null |
2023-11-09 | What Do I Hear? Generating Sounds for Visuals with ChatGPT | David Chuan-En Lin et.al. | 2311.05609v1 | null |
2023-11-09 | Real-Time Neural Rasterization for Large Scenes | Jeffrey Yunfan Liu et.al. | 2311.05607v1 | null |
2023-11-09 | Diffusion-Generative Multi-Fidelity Learning for Physical Simulation | Zheng Wang et.al. | 2311.05606v1 | null |
2023-11-09 | 3D-QAE: Fully Quantum Auto-Encoding of 3D Point Clouds | Lakshika Rathi et.al. | 2311.05604v1 | null |
2023-11-09 | Reconstructing Objects in-the-wild for Realistic Sensor Simulation | Ze Yang et.al. | 2311.05602v1 | null |
2023-11-09 | SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers | Sammy Christen et.al. | 2311.05599v1 | null |
2023-11-09 | LLM Augmented Hierarchical Agents | Bharat Prakash et.al. | 2311.05596v1 | null |
2023-11-08 | GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs | Zhenfang Chen et.al. | 2311.04901v1 | null |
2023-11-08 | How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure | Michael Wilson et.al. | 2311.04900v1 | link |
2023-11-08 | Optimized measurements of chaotic dynamical systems via the information bottleneck | Kieran A. Murphy et.al. | 2311.04896v1 | null |
2023-11-08 | The Monadic Theory of Toric Words | Valérie Berthé et.al. | 2311.04895v1 | null |
2023-11-08 | Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs | Shashank Gupta et.al. | 2311.04892v1 | link |
2023-11-08 | AutoChip: Automating HDL Generation Using LLM Feedback | Shailja Thakur et.al. | 2311.04887v1 | link |
2023-11-08 | SEMQA: Semi-Extractive Multi-Source Question Answering | Tal Schuster et.al. | 2311.04886v1 | link |
2023-11-07 | Towards Garment Sewing Pattern Reconstruction from a Single Image | Lijuan Liu et.al. | 2311.04218v1 | link |
2023-11-07 | Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves | Yihe Deng et.al. | 2311.04205v1 | link |
2023-11-07 | Sharp Thresholds Imply Circuit Lower Bounds: from random 2-SAT to Planted Clique | David Gamarnik et.al. | 2311.04204v1 | null |
2023-11-07 | Exploring Recommendation Capabilities of GPT-4V(ision): A Preliminary Case Study | Peilin Zhou et.al. | 2311.04199v1 | null |
2023-11-07 | JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction | Zhongfen Deng et.al. | 2311.04196v1 | link |
2023-11-06 | GLaMM: Pixel Grounding Large Multimodal Model | Hanoona Rasheed et.al. | 2311.03356v1 | null |
2023-11-06 | SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis | Hanrong Ye et.al. | 2311.03355v1 | null |
2023-11-06 | CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding | Junyan Li et.al. | 2311.03354v1 | null |
2023-11-06 | Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation | Rusheb Shah et.al. | 2311.03348v1 | null |
2023-11-06 | Decomposing Probability Marginals Beyond Affine Requirements | Jannik Matuschke et.al. | 2311.03346v1 | null |
2023-11-06 | Long-Term Invariant Local Features via Implicit Cross-Domain Correspondences | Zador Pataki et.al. | 2311.03345v1 | null |
2023-11-06 | Embedding First Order Logic into Kernel Machines | Michelangelo Diligenti et.al. | 2311.03340v1 | null |
2023-11-03 | EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision | Jiawei Yang et.al. | 2311.02077v1 | null |
2023-11-03 | Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos | Dayal Singh Kalra et.al. | 2311.02076v1 | null |
2023-11-03 | Envy-Free Cake-Cutting for Four Agents | Alexandros Hollender et.al. | 2311.02075v1 | null |
2023-11-03 | Learning Historical Status Prompt for Accurate and Robust Visual Tracking | Wenrui Cai et.al. | 2311.02072v1 | null |
2023-11-03 | Grounded Intuition of GPT-Vision’s Abilities with Scientific Images | Alyssa Hwang et.al. | 2311.02069v1 | link |
2023-11-03 | GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations | Yuxiao Zhou et.al. | 2311.02062v1 | null |
2023-11-03 | Active Learning-Based Species Range Estimation | Christian Lange et.al. | 2311.02061v1 | link |
2023-11-02 | Idempotent Generative Network | Assaf Shocher et.al. | 2311.01462v1 | null |
2023-11-02 | Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization | Jameel Hassan et.al. | 2311.01459v1 | null |
2023-11-02 | Detecting Deepfakes Without Seeing Any | Tal Reiss et.al. | 2311.01458v1 | link |
2023-11-02 | RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation | Yufei Wang et.al. | 2311.01455v1 | null |
2023-11-02 | NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities | Ruohan Zhang et.al. | 2311.01454v1 | null |
2023-11-02 | DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing | Vint Lee et.al. | 2311.01450v1 | null |
2023-11-02 | UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation | Yuwen Xiong et.al. | 2311.01448v1 | null |
2023-11-02 | CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation | Jingkang Wang et.al. | 2311.01447v1 | null |
2023-11-02 | Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation | Jay Sarva et.al. | 2311.01446v1 | null |
2023-11-01 | End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Juan Zuluaga-Gomez et.al. | 2311.00697v1 | link |
2023-11-01 | Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Zhan Ling et.al. | 2311.00694v1 | link |
2023-11-01 | Improving Interpersonal Communication by Simulating Audiences with Language Models | Ryan Liu et.al. | 2311.00687v1 | link |
2023-11-01 | Deep Learning-Based Classification of Gamma Photon Interactions in Room-Temperature Semiconductor Radiation Detectors | Sandeep K. Chaudhuri et.al. | 2311.00682v1 | null |
2023-11-01 | Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs | Xue-Yong Fu et.al. | 2311.00681v1 | null |
2023-10-31 | Unexpected Improvements to Expected Improvement for Bayesian Optimization | Sebastian Ament et.al. | 2310.20708v1 | null |
2023-10-31 | What’s In My Big Data? | Yanai Elazar et.al. | 2310.20707v1 | link |
2023-10-31 | DDAM-PS: Diligent Domain Adaptive Mixer for Person Search | Mohammed Khaleed Almansoori et.al. | 2310.20706v1 | link |
2023-10-31 | SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction | Xinyuan Chen et.al. | 2310.20700v1 | null |
2023-11-01 | Bayesian Multistate Bennett Acceptance Ratio Methods | Xinqiang Ding et.al. | 2310.20699v2 | link |
2023-10-31 | Learning From Mistakes Makes LLM Better Reasoner | Shengnan An et.al. | 2310.20689v1 | link |
2023-10-31 | Compression with Exact Error Distribution for Federated Learning | Mahmoud Hegazy et.al. | 2310.20682v1 | null |
2023-10-30 | Variational principles for the hydrodynamics of the classical one-component plasma | Daniels Krimans et.al. | 2310.19239v1 | null |
2023-10-30 | Building Real-World Meeting Summarization Systems using Large Language Models: A Practical Perspective | Md Tahmid Rahman Laskar et.al. | 2310.19233v1 | null |
2023-10-30 | Stochastic Configuration Machines: FPGA Implementation | Matthew J. Felicetti et.al. | 2310.19225v1 | null |
2023-10-30 | CHAMMI: A benchmark for channel-adaptive models in microscopy imaging | Zitong Chen et.al. | 2310.19224v1 | link |
2023-10-27 | FP8-LM: Training FP8 Large Language Models | Houwen Peng et.al. | 2310.18313v1 | link |
2023-10-27 | Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models | Pushkal Katara et.al. | 2310.18308v1 | null |
2023-10-27 | Interactive Motion Planning for Autonomous Vehicles with Joint Optimization | Yuxiao Chen et.al. | 2310.18301v1 | null |
2023-10-27 | Enhancing the Performance of a Biomimetic Robotic Elbow-and-Forearm System Through Bionics-Inspired Optimization | Haosen Yang et.al. | 2310.18299v1 | null |
2023-10-27 | Sharp-Edge Diffraction of Laguerre-Gauss Vortex Beams by Elliptic Apertures | Riccardo Borghi et.al. | 2310.18298v1 | null |
2023-10-27 | Addressing GAN Training Instabilities via Tunable Classification Losses | Monica Welfert et.al. | 2310.18291v1 | null |
2023-10-26 | Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model | Karsten Roth et.al. | 2310.17653v1 | link |
2023-10-26 | A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection | Anas Al-lahham et.al. | 2310.17650v1 | link |
2023-10-26 | 6-DoF Stability Field via Diffusion Models | Takuma Yoneda et.al. | 2310.17649v1 | null |
2023-10-26 | In-Context Learning Dynamics with Random Binary Sequences | Eric J. Bigelow et.al. | 2310.17639v1 | null |
2023-10-26 | Generative Fractional Diffusion Models | Gabriel Nobis et.al. | 2310.17638v1 | null |
2023-10-26 | JudgeLM: Fine-tuned Large Language Models are Scalable Judges | Lianghui Zhu et.al. | 2310.17631v1 | link |
2023-10-25 | SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation | Qianxu Wang et.al. | 2310.16838v1 | null |
2023-10-25 | Proposal-Contrastive Pretraining for Object Detection from Fewer Data | Quentin Bouniot et.al. | 2310.16835v1 | null |
2023-10-25 | CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images | Aaron Gokaslan et.al. | 2310.16825v1 | link |
2023-10-26 | DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior | Jingxiang Sun et.al. | 2310.16818v2 | link |
2023-10-25 | The intelligent agent model – a fully two-dimensional microscopic traffic flow model | Martin Treiber et.al. | 2310.16816v1 | null |
2023-10-24 | Synthetic Data as Validation | Qixin Hu et.al. | 2310.16052v1 | null |
2023-10-24 | EquivAct: SIM(3)-Equivariant Visuomotor Policies beyond Rigid Object Manipulation | Jingyun Yang et.al. | 2310.16050v1 | null |
2023-10-24 | MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning | Zayne Sprague et.al. | 2310.16049v1 | link |
2023-10-24 | From Posterior Sampling to Meaningful Diversity in Image Restoration | Noa Cohen et.al. | 2310.16047v1 | null |
2023-10-24 | Woodpecker: Hallucination Correction for Multimodal Large Language Models | Shukang Yin et.al. | 2310.16045v1 | link |
2023-10-25 | Stanford-ORB: A Real-World 3D Object Inverse Rendering Benchmark | Zhengfei Kuang et.al. | 2310.16044v2 | link |
2023-10-25 | WebWISE: Web Interface Control and Sequential Exploration with Large Language Models | Heyi Tao et.al. | 2310.16042v2 | null |
2023-10-24 | Instruct and Extract: Instruction Tuning for On-Demand Information Extraction | Yizhu Jiao et.al. | 2310.16040v1 | link |
2023-10-23 | FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling | Haonan Qiu et.al. | 2310.15169v1 | link |
2023-10-24 | Ghost on the Shell: An Expressive Representation of General 3D Shapes | Zhen Liu et.al. | 2310.15168v2 | null |
2023-10-23 | SAM-Med3D | Haoyu Wang et.al. | 2310.15161v1 | link |
2023-10-23 | FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models | Lihe Yang et.al. | 2310.15160v1 | link |
2023-10-23 | Online Detection of AI-Generated Images | David C. Epstein et.al. | 2310.15150v1 | null |
2023-10-23 | DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design | Kevin Lin et.al. | 2310.15144v1 | link |
2023-10-23 | SpecTr: Fast Speculative Decoding via Optimal Transport | Ziteng Sun et.al. | 2310.15141v1 | null |
2023-10-20 | Neural-Base Music Generation for Intelligence Duplication | Jacob Galajda et.al. | 2310.13691v1 | null |
2023-10-20 | Exploring Linguistic Probes for Morphological Generalization | Jordan Kodner et.al. | 2310.13686v1 | null |
2023-10-20 | CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages | Gabriel Oliveira dos Santos et.al. | 2310.13683v1 | link |
2023-10-20 | Optimizing Retrieval-augmented Reader Models via Token Elimination | Moshe Berchansky et.al. | 2310.13682v1 | link |
2023-10-20 | Information Value: Measuring Utterance Predictability as Distance from Plausible Alternatives | Mario Giulianelli et.al. | 2310.13676v1 | link |
2023-10-20 | On Synthetic Data for Back Translation | Jiahao Xu et.al. | 2310.13675v1 | link |
2023-10-19 | HumanTOMATO: Text-aligned Whole-body Motion Generation | Shunlin Lu et.al. | 2310.12978v1 | null |
2023-10-19 | Training Dynamics of Deep Network Linear Regions | Ahmed Imtiaz Humayun et.al. | 2310.12977v1 | null |
2023-10-19 | Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Ziqi Pang et.al. | 2310.12973v1 | link |
2023-10-19 | CCIL: Continuity-based Data Augmentation for Corrective Imitation Learning | Liyiming Ke et.al. | 2310.12972v1 | null |
2023-10-19 | CLAIR: Evaluating Image Captions with Large Language Models | David Chan et.al. | 2310.12971v1 | null |
2023-10-19 | Does Your Model Think Like an Engineer? Explainable AI for Bearing Fault Detection with Deep Learning | Thomas Decker et.al. | 2310.12967v1 | null |
2023-10-18 | Understanding Retrieval Augmentation for Long-Form Question Answering | Hung-Ting Chen et.al. | 2310.12150v1 | null |
2023-10-18 | Object-aware Inversion and Reassembly for Image Editing | Zhen Yang et.al. | 2310.12149v1 | null |
2023-10-18 | Simple Mechanisms for Representing, Indexing and Manipulating Concepts | Yuanzhi Li et.al. | 2310.12143v1 | null |
2023-10-17 | DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis | Youngjoong Kwon et.al. | 2310.11449v1 | null |
2023-10-17 | Functional Invariants to Watermark Large Transformers | Fernandez Pierre et.al. | 2310.11446v1 | null |
2023-10-18 | EvalCrafter: Benchmarking and Evaluating Large Video Generation Models | Yaofang Liu et.al. | 2310.11440v2 | link |
2023-10-17 | Sadness, Anger, or Anxiety: Twitter Users’ Emotional Responses to Toxicity in Public Conversations | Ana Aleksandric et.al. | 2310.11436v1 | null |
2023-10-17 | An Empirical Study of Translation Hypothesis Ensembling with Large Language Models | António Farinhas et.al. | 2310.11430v1 | link |
2023-10-17 | Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression | Adam Block et.al. | 2310.11428v1 | null |
2023-10-17 | A Computational Framework for Solving Wasserstein Lagrangian Flows | Kirill Neklyudov et.al. | 2310.10649v2 | link |
2023-10-16 | Step-by-Step Remediation of Students’ Mathematical Mistakes | Rose E. Wang et.al. | 2310.10648v1 | link |
2023-10-16 | A Survey on Video Diffusion Models | Zhen Xing et.al. | 2310.10647v1 | link |
2023-10-16 | Interactive Task Planning with Language Models | Boyi Li et.al. | 2310.10645v1 | null |
2023-10-16 | TOSS: High-quality Text-guided Novel View Synthesis from a Single Image | Yukai Shi et.al. | 2310.10644v1 | null |
2023-10-16 | Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting | Zeyu Yang et.al. | 2310.10642v1 | link |
2023-10-16 | LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts | Hanan Gani et.al. | 2310.10640v1 | link |
2023-10-16 | Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models | Kevin Black et.al. | 2310.10639v1 | link |
2023-10-13 | Vision-by-Language for Training-Free Compositional Image Retrieval | Shyamgopal Karthik et.al. | 2310.09291v1 | link |
2023-10-13 | Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning | Geri Skenderi et.al. | 2310.09278v1 | null |
2023-10-13 | Retro-fallback: retrosynthetic planning in an uncertain world | Austin Tripp et.al. | 2310.09270v1 | null |
2023-10-13 | Genetic algorithms are strong baselines for molecule generation | Austin Tripp et.al. | 2310.09267v1 | null |
2023-10-13 | Towards End-to-end 4-Bit Inference on Generative Large Language Models | Saleh Ashkboos et.al. | 2310.09259v1 | link |
2023-10-12 | Octopus: Embodied Vision-Language Programmer from Environmental Feedback | Jingkang Yang et.al. | 2310.08588v1 | link |
2023-10-12 | Is Generalized Dynamic Novel View Synthesis from Monocular Videos Possible Today? | Xiaoming Zhao et.al. | 2310.08587v1 | null |
2023-10-12 | PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm | Haoyi Zhu et.al. | 2310.08586v1 | link |
2023-10-12 | Discovering Fatigued Movements for Virtual Character Animation | Noshaba Cheema et.al. | 2310.08583v1 | null |
2023-10-12 | Tree-Planner: Efficient Close-loop Task Planning with Large Language Models | Mengkang Hu et.al. | 2310.08582v1 | null |
2023-10-12 | Universal Visual Decomposer: Long-Horizon Manipulation Made Easy | Zichen Zhang et.al. | 2310.08581v1 | null |
2023-10-12 | OmniControl: Control Any Joint at Any Time for Human Motion Generation | Yiming Xie et.al. | 2310.08580v1 | link |
2023-10-12 | HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion | Xian Liu et.al. | 2310.08579v1 | null |
2023-10-12 | Learning to Act from Actionless Videos through Dense Correspondences | Po-Chen Ko et.al. | 2310.08576v1 | null |
2023-10-12 | Jigsaw: Supporting Designers in Prototyping Multimodal Applications by Assembling AI Foundation Models | David Chuan-En Lin et.al. | 2310.08574v1 | null |
2023-10-11 | InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining | Boxin Wang et.al. | 2310.07713v1 | link |
2023-10-11 | ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models | Yingqing He et.al. | 2310.07702v1 | link |
2023-10-11 | Knowledge-enhanced Memory Model for Emotional Support Conversation | Mengzhao Jia et.al. | 2310.07700v1 | null |
2023-10-11 | From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions | Zhengfeng Lai et.al. | 2310.07699v1 | link |
2023-10-11 | SurroCBM: Concept Bottleneck Surrogate Models for Generative Post-hoc Explanation | Bo Pan et.al. | 2310.07698v1 | null |
2023-10-11 | ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation | Bo Peng et.al. | 2310.07697v1 | link |
2023-10-11 | Large-scale photonic computing with nonlinear disordered media | Hao Wang et.al. | 2310.07690v1 | null |
2023-10-10 | AutoAD II: The Sequel – Who, When, and What in Movie Audio Description | Tengda Han et.al. | 2310.06838v1 | null |
2023-10-10 | Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency | Eric Zelikman et.al. | 2310.06837v1 | null |
2023-10-10 | What Does Stable Diffusion Know about the 3D Scene? | Guanqi Zhan et.al. | 2310.06836v1 | link |
2023-10-10 | Teaching Language Models to Hallucinate Less with Synthetic Tasks | Erik Jones et.al. | 2310.06827v1 | null |
2023-10-10 | Mistral 7B | Albert Q. Jiang et.al. | 2310.06825v1 | link |
2023-10-10 | The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets | Samuel Marks et.al. | 2310.06824v1 | link |
2023-10-09 | Grokking as Compression: A Nonlinear Complexity Perspective | Ziming Liu et.al. | 2310.05918v1 | null |
2023-10-09 | Drivable Avatar Clothing: Faithful Full-Body Telepresence with Dynamic Clothing Driven by Sparse RGB-D Input | Donglai Xiang et.al. | 2310.05917v1 | null |
2023-10-09 | FireAct: Toward Language Agent Fine-tuning | Baian Chen et.al. | 2310.05915v1 | null |
2023-10-09 | SALMON: Self-Alignment with Principle-Following Reward Models | Zhiqing Sun et.al. | 2310.05910v1 | link |
2023-10-09 | Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts | Lizhang Chen et.al. | 2310.05898v1 | null |
2023-10-06 | BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity | Andrew F. Luo et.al. | 2310.04420v1 | null |
2023-10-06 | Functional Interpolation for Relative Positions Improves Long Context Transformers | Shanda Li et.al. | 2310.04418v1 | null |
2023-10-09 | CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis | Xiaoxiao Sun et.al. | 2310.04414v2 | null |
2023-10-06 | FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning | Peiran Xu et.al. | 2310.04412v1 | link |
2023-10-06 | RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation | Fangyuan Xu et.al. | 2310.04408v1 | link |
2023-10-06 | Policy-Gradient Training of Language Models for Ranking | Ge Gao et.al. | 2310.04407v1 | null |
2023-10-06 | Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Andy Zhou et.al. | 2310.04406v1 | link |
2023-10-05 | ContactGen: Generative Contact Modeling for Grasp Generation | Shaowei Liu et.al. | 2310.03740v1 | null |
2023-10-05 | Aligning Text-to-Image Diffusion Models with Reward Backpropagation | Mihir Prabhudesai et.al. | 2310.03739v1 | link |
2023-10-05 | Stylist: Style-Driven Feature Ranking for Robust Novelty Detection | Stefan Smeu et.al. | 2310.03738v1 | link |
2023-10-05 | Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency | Tianhong Li et.al. | 2310.03734v1 | null |
2023-10-05 | MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Ke Wang et.al. | 2310.03731v1 | link |
2023-10-05 | Stochastic interpolants with data-dependent couplings | Michael S. Albergo et.al. | 2310.03725v1 | null |
2023-10-04 | LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving | Hao Sha et.al. | 2310.03026v1 | null |
2023-10-04 | Retrieval meets Long Context Large Language Models | Peng Xu et.al. | 2310.03025v1 | null |
2023-10-04 | Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making | Jeonghye Kim et.al. | 2310.03022v1 | null |
2023-10-04 | Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models | Jianglong Ye et.al. | 2310.03020v1 | null |
2023-10-04 | Multimodal Question Answering for Unified Information Extraction | Yuxuan Sun et.al. | 2310.03017v1 | link |
2023-10-04 | Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day | Yifan Jiang et.al. | 2310.03015v1 | null |
2023-10-04 | SemiReward: A General Reward Model for Semi-supervised Learning | Siyuan Li et.al. | 2310.03013v1 | link |
2023-10-04 | Towards Domain-Specific Features Disentanglement for Domain Generalization | Hao Chen et.al. | 2310.03007v1 | null |
2023-10-05 | COOLer: Class-Incremental Learning for Appearance-Based Multiple Object Tracking | Zhizheng Liu et.al. | 2310.03006v2 | link |
2023-10-03 | Generalizable Long-Horizon Manipulations with Large Language Models | Haoyu Zhou et.al. | 2310.02264v1 | null |
2023-10-03 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Pan Lu et.al. | 2310.02255v1 | null |
2023-10-03 | Talk2BEV: Language-enhanced Bird’s-eye View Maps for Autonomous Driving | Vikrant Dewangan et.al. | 2310.02251v1 | null |
2023-10-03 | Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models | Huaijin Pi et.al. | 2310.02242v1 | null |
2023-10-03 | MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens | Kaizhi Zheng et.al. | 2310.02239v1 | link |
2023-09-29 | Efficient Streaming Language Models with Attention Sinks | Guangxuan Xiao et.al. | 2309.17453v1 | link |
2023-10-02 | L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models | Ansong Ni et.al. | 2309.17446v2 | null |
2023-10-02 | LLM-grounded Video Diffusion Models | Long Lian et.al. | 2309.17444v2 | null |
2023-09-29 | CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets | Lifan Yuan et.al. | 2309.17428v1 | link |
2023-09-28 | Learning to Transform for Generalizable Instance-wise Invariance | Utkarsh Singhal et.al. | 2309.16672v1 | link |
2023-09-29 | Demystifying CLIP Data | Hu Xu et.al. | 2309.16671v2 | link |
2023-09-28 | RealFill: Reference-Driven Generation for Authentic Image Completion | Luming Tang et.al. | 2309.16668v1 | null |
2023-09-28 | DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation | Jiaxiang Tang et.al. | 2309.16653v1 | link |
2023-09-27 | Exploiting the Signal-Leak Bias in Diffusion Models | Martin Nicolas Everaert et.al. | 2309.15842v1 | null |
2023-09-27 | OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs | Honglin He et.al. | 2309.15830v1 | null |
2023-09-27 | LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement | Haonan Chang et.al. | 2309.15821v1 | null |
2023-09-27 | Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | David Junhao Zhang et.al. | 2309.15818v1 | link |
2023-09-26 | Generating Visual Scenes from Touch | Fengyu Yang et.al. | 2309.15117v1 | null |
2023-09-27 | InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition | Pan Zhang et.al. | 2309.15112v2 | link |
2023-09-26 | Doduo: Learning Dense Visual Correspondence from Unsupervised Semantic-Aware Flow | Zhenyu Jiang et.al. | 2309.15110v1 | null |
2023-09-26 | DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation | Zeyu Wang et.al. | 2309.15109v1 | link |
2023-09-26 | New solution to Airy’s equation for modeling beams near turning points | N. A. Lopez et.al. | 2309.15108v1 | null |
2023-09-25 | Extreme Parkour with Legged Robots | Xuxin Cheng et.al. | 2309.14341v1 | null |
2023-09-25 | Chop & Learn: Recognizing and Generating Object-State Compositions | Nirat Saini et.al. | 2309.14339v1 | null |
2023-09-25 | UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation | Jianglin Fu et.al. | 2309.14335v1 | link |
2023-09-25 | Tasks Makyth Models: Machine Learning Assisted Surrogates for Tipping Points | Gianluca Fabiani et.al. | 2309.14334v1 | null |
2023-09-25 | Innovative Digital Storytelling with AIGC: Exploration and Discussion of Recent Advances | Rongzhang Gu et.al. | 2309.14329v1 | null |
2023-09-25 | pyParaOcean: A System for Visual Analysis of Ocean Data | Toshit Jain et.al. | 2309.14328v1 | null |
2023-09-22 | E(2)-Equivariant Graph Planning for Navigation | Linfeng Zhao et.al. | 2309.13043v1 | null |
2023-09-22 | MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation | Jiahao Xie et.al. | 2309.13042v1 | link |
2023-09-22 | Robotic Offline RL from Internet Videos via Value-Function Pre-Training | Chethan Bhateja et.al. | 2309.13041v1 | null |
2023-09-22 | Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? | Xiaoxiao Sun et.al. | 2309.13038v1 | null |
2023-09-22 | GELLO: A General, Low-Cost, and Intuitive Teleoperation Framework for Robot Manipulators | Philipp Wu et.al. | 2309.13037v1 | null |
2023-09-22 | A numerical framework for simulating progressive failure in composite laminates under high-cycle fatigue loading | Pieter Hofman et.al. | 2309.13030v1 | null |
2023-09-21 | LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Jianing Yang et.al. | 2309.12311v1 | null |
2023-09-21 | Rehearsal: Simulating Conflict to Teach Conflict Resolution | Omar Shaikh et.al. | 2309.12309v1 | null |
2023-09-21 | Text-Guided Vector Graphics Customization | Peiying Zhang et.al. | 2309.12302v1 | null |
2023-09-21 | Environment-biased Feature Ranking for Novelty Detection Robustness | Stefan Smeu et.al. | 2309.12301v1 | null |
2023-09-21 | Reranking for Natural Language Generation from Logical Forms: A Study based on Large Language Models | Levon Haroutunian et.al. | 2309.12294v1 | null |
2023-09-20 | A Large-scale Dataset for Audio-Language Representation Learning | Luoyi Sun et.al. | 2309.11500v1 | null |
2023-09-20 | DreamLLM: Synergistic Multimodal Comprehension and Creation | Runpei Dong et.al. | 2309.11499v1 | link |
2023-09-20 | FreeU: Free Lunch in Diffusion U-Net | Chenyang Si et.al. | 2309.11497v1 | link |
2023-09-20 | Chain-of-Verification Reduces Hallucination in Large Language Models | Shehzaad Dhuliawala et.al. | 2309.11495v1 | null |
2023-09-21 | Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning | Tianbao Xie et.al. | 2309.11489v2 | link |
2023-09-19 | PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes | Xiao Fu et.al. | 2309.10815v1 | link |
2023-09-19 | Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning | Tianhua Zhang et.al. | 2309.10814v1 | link |
2023-09-19 | PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance | Peiqing Yang et.al. | 2309.10810v1 | link |
2023-09-20 | AI Foundation Models for Weather and Climate: Applications, Design, and Implementation | S. Karthik Mukkavilli et.al. | 2309.10808v2 | null |
2023-09-19 | Heuristic Search for Path Finding with Refuelling | Anushtup Nandy et.al. | 2309.10796v1 | null |
2023-09-19 | Guide Your Agent with Adaptive Multimodal Rewards | Changyeon Kim et.al. | 2309.10790v1 | link |
2023-09-18 | General In-Hand Object Rotation with Vision and Touch | Haozhi Qi et.al. | 2309.09979v1 | null |
2023-09-18 | GEDepth: Ground Embedding for Monocular Depth Estimation | Xiaodong Yang et.al. | 2309.09975v1 | link |
2023-09-19 | MindAgent: Emergent Gaming Interaction | Ran Gong et.al. | 2309.09971v2 | null |
2023-09-18 | Empirical Study of Mix-based Data Augmentation Methods in Physiological Time Series Data | Peikun Guo et.al. | 2309.09970v1 | link |
2023-09-18 | Prompt a Robot to Walk with Large Language Models | Yen-Jen Wang et.al. | 2309.09969v1 | link |
2023-09-18 | Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees | Alexia Jolicoeur-Martineau et.al. | 2309.09968v1 | link |
2023-09-15 | Robust e-NeRF: NeRF from Sparse & Noisy Events under Non-Uniform Motion | Weng Fei Low et.al. | 2309.08596v1 | link |
2023-09-15 | Chain-of-Thought Reasoning is a Policy Improvement Operator | Hugh Zhang et.al. | 2309.08589v1 | null |
2023-09-15 | Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes | Fabien Delattre et.al. | 2309.08588v1 | null |
2023-09-15 | Compositional Foundation Models for Hierarchical Planning | Anurag Ajay et.al. | 2309.08587v1 | null |
2023-09-15 | Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding | Xiaonan Lu et.al. | 2309.08585v1 | null |
2023-09-15 | ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer | Arkadiy Saakyan et.al. | 2309.08583v1 | link |
2023-09-15 | Large-Vocabulary 3D Diffusion Model with Transformer | Ziang Cao et.al. | 2309.07920v2 | null |
2023-09-14 | Unified Human-Scene Interaction via Prompted Chain-of-Contacts | Zeqi Xiao et.al. | 2309.07918v1 | link |
2023-09-14 | Looking at words and points with attention: a benchmark for text-to-shape coherence | Andrea Amaduzzi et.al. | 2309.07917v1 | null |
2023-09-14 | MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning | Haozhe Zhao et.al. | 2309.07915v1 | link |
2023-09-14 | ALWOD: Active Learning for Weakly-Supervised Object Detection | Yuting Wang et.al. | 2309.07914v1 | link |
2023-09-14 | Why would you put a flashlight in a dark matter detector? | R. Gibbons et.al. | 2309.07913v1 | null |
2023-09-14 | TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting | Rohan Choudhury et.al. | 2309.07910v1 | null |
2023-09-14 | Physically Plausible Full-Body Hand-Object Interaction Synthesis | Jona Braun et.al. | 2309.07907v1 | null |
2023-09-14 | Generative Image Dynamics | Zhengqi Li et.al. | 2309.07906v1 | null |
2023-09-13 | Text-Guided Generation and Editing of Compositional 3D Avatars | Hao Zhang et.al. | 2309.07125v1 | null |
2023-09-13 | RAIN: Your Language Models Can Align Themselves without Finetuning | Yuhui Li et.al. | 2309.07124v1 | link |
2023-09-13 | Tree-Structured Shading Decomposition | Chen Geng et.al. | 2309.07122v1 | null |
2023-09-13 | Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics | Haoqin Tu et.al. | 2309.07120v1 | link |
2023-09-13 | Weakly-Supervised Multi-Task Learning for Audio-Visual Speaker Verification | Anith Selvakumar et.al. | 2309.07115v1 | null |
2023-09-13 | Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology | Nirhoshan Sivaroopan et.al. | 2309.07113v1 | null |
2023-09-13 | Hardening RGB-D Object Recognition Systems against Adversarial Patch Attacks | Yang Zheng et.al. | 2309.07106v1 | null |
2023-09-12 | Learning Disentangled Avatars with Hybrid 3D Representations | Yao Feng et.al. | 2309.06441v1 | null |
2023-09-12 | Unveiling the potential of large language models in generating semantic and cross-language clones | Palash R. Roy et.al. | 2309.06424v1 | null |
2023-09-12 | C4CAM: A Compiler for CAM-based In-memory Accelerators | Hamid Farzaneh et.al. | 2309.06418v1 | null |
2023-09-12 | Robot Parkour Learning | Ziwen Zhuang et.al. | 2309.05665v2 | null |
2023-09-11 | Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips | Yufei Ye et.al. | 2309.05663v1 | null |
2023-09-11 | ViHOPE: Visuotactile In-Hand Object 6D Pose Estimation with Shape Completion | Hongyu Li et.al. | 2309.05662v1 | null |
2023-09-11 | Hypothesis Search: Inductive Reasoning with Language Models | Ruocheng Wang et.al. | 2309.05660v1 | null |
2023-09-11 | From Capture to Display: A Survey on Volumetric Video | Yili Jin et.al. | 2309.05658v1 | null |
2023-09-11 | MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Xiang Yue et.al. | 2309.05653v1 | null |
2023-09-11 | Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck | K. Michael Martini et.al. | 2309.05649v1 | null |
2023-09-08 | On the Actionability of Outcome Prediction | Lydia T. Liu et.al. | 2309.04470v1 | null |
2023-09-08 | Generalized Cross-domain Multi-label Few-shot Learning for Chest X-rays | Aroof Aimen et.al. | 2309.04462v1 | null |
2023-09-08 | Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models | Yangyi Chen et.al. | 2309.04461v1 | link |
2023-09-08 | Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning | David Yunis et.al. | 2309.04459v1 | null |
2023-09-08 | Effect of Electron-Phonon Interactions on Three-Level QD-based Spaser: Linear and Quadratic Potentials | Ankit Purohit et.al. | 2309.04448v1 | null |
2023-09-07 | ImageBind-LLM: Multi-modality Instruction Tuning | Jiaming Han et.al. | 2309.03905v1 | link |
2023-09-07 | Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Jiapeng Zhu et.al. | 2309.03904v1 | link |
2023-09-07 | Tracking Anything with Decoupled Video Segmentation | Ho Kei Cheng et.al. | 2309.03903v1 | link |
2023-09-07 | The Making and Breaking of Camouflage | Hala Lamdouar et.al. | 2309.03899v1 | null |
2023-09-07 | InstructDiffusion: A Generalist Modeling Interface for Vision Tasks | Zigang Geng et.al. | 2309.03895v1 | null |
2023-09-07 | DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection | Manlin Zhang et.al. | 2309.03893v1 | null |
2023-09-07 | ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation | Hui Zhang et.al. | 2309.03891v1 | null |
2023-09-06 | My Art My Choice: Adversarial Protection Against Unruly AI | Anthony Rhodes et.al. | 2309.03198v1 | null |
2023-09-06 | Electrocaloric Response of the Dense Ferroelectric Nanocomposites | Anna N. Morozovska et.al. | 2309.03187v1 | null |
2023-09-06 | SLiMe: Segment Like Me | Aliasghar Khani et.al. | 2309.03179v1 | link |
2023-09-05 | ReliTalk: Relightable Talking Portrait Generation from a Single Video | Haonan Qiu et.al. | 2309.02434v1 | link |
2023-09-05 | Generating Realistic Images from In-the-wild Sounds | Taegyeong Lee et.al. | 2309.02405v1 | null |
2023-09-01 | OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Zhening Huang et.al. | 2309.00616v1 | link |
2023-09-01 | Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following | Ziyu Guo et.al. | 2309.00615v1 | link |
2023-09-01 | Iterative Multi-granular Image Editing using Diffusion Models | K J Joseph et.al. | 2309.00613v1 | null |
2023-09-01 | CityDreamer: Compositional Generative Model of Unbounded 3D Cities | Haozhe Xie et.al. | 2309.00610v1 | link |
2023-09-01 | Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair | Yuxiang Wei et.al. | 2309.00608v1 | link |
2023-08-31 | PointLLM: Empowering Large Language Models to Understand Point Clouds | Runsen Xu et.al. | 2308.16911v1 | link |
2023-08-31 | StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation | Yuhan Wang et.al. | 2308.16909v1 | link |
2023-08-31 | Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator | Xiaolong Wang et.al. | 2308.16906v1 | link |
2023-08-31 | InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion | Sirui Xu et.al. | 2308.16905v1 | link |
2023-08-31 | Transformers as Support Vector Machines | Davoud Ataee Tarzanagh et.al. | 2308.16898v1 | link |
2023-09-01 | GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields | Yanjie Ze et.al. | 2308.16891v2 | link |
2023-08-31 | Prediction of Diblock Copolymer Morphology via Machine Learning | Hyun Park et.al. | 2308.16886v1 | null |
2023-08-30 | Learning Vision-based Pursuit-Evasion Robot Policies | Andrea Bajcsy et.al. | 2308.16185v1 | null |
2023-08-30 | SAM-Med2D | Junlong Cheng et.al. | 2308.16184v1 | link |
2023-08-30 | GREC: Generalized Referring Expression Comprehension | Shuting He et.al. | 2308.16182v1 | link |
2023-08-30 | Framework and Methodology for Verification of a Complex Scientific Simulation Software, Flash-X | Akash Dhruv et.al. | 2308.16180v1 | null |
2023-08-30 | General Purpose Audio Effect Removal | Matthew Rice et.al. | 2308.16177v1 | link |
2023-08-30 | Quantifying Uncertainty in Answers from any Language Model via Intrinsic and Extrinsic Confidence Assessment | Jiuhai Chen et.al. | 2308.16175v1 | null |
2023-08-29 | 3D Adversarial Augmentations for Robust Out-of-Domain Predictions | Alexander Lehner et.al. | 2308.15479v1 | null |
2023-08-29 | A General-Purpose Self-Supervised Model for Computational Pathology | Richard J. Chen et.al. | 2308.15474v1 | null |
2023-08-29 | Learning Modulated Transformation in GANs | Ceyuan Yang et.al. | 2308.15472v1 | link |
2023-08-29 | Input margins can predict generalization too | Coenraad Mouton et.al. | 2308.15466v1 | null |
2023-08-30 | Sharing proofs with predicative theories through universe polymorphic elaboration | Thiago Felicissimo et.al. | 2308.15465v2 | link |
2023-08-29 | ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer | Zachary Horvitz et.al. | 2308.15459v1 | link |
2023-08-29 | From SMOTE to Mixup for Deep Imbalanced Classification | Wei-Chao Cheng et.al. | 2308.15457v1 | link |
2023-08-28 | AI Deception: A Survey of Examples, Risks, and Potential Solutions | Peter S. Park et.al. | 2308.14752v1 | null |
2023-08-28 | MagicAvatar: Multimodal Avatar Generation and Animation | Jianfeng Zhang et.al. | 2308.14748v1 | null |
2023-08-28 | CoVR: Learning Composed Video Retrieval from Web Video Captions | Lucas Ventura et.al. | 2308.14746v1 | link |
2023-08-28 | Advancement on Security Applications of Private Intersection Sum Protocol | Yuvaray Athur Raghuvir et.al. | 2308.14741v1 | null |
2023-08-28 | Total Selfie: Generating Full-Body Selfies | Bowei Chen et.al. | 2308.14740v1 | null |
2023-08-28 | Bayesian artificial brain with ChatGPT | Renato A. Krohling et.al. | 2308.14732v1 | null |
2023-08-28 | Distilled GPT for Source Code Summarization | Chia-Yi Su et.al. | 2308.14731v1 | link |
2023-08-25 | ChatGPT as Data Augmentation for Compositional Generalization: A Case Study in Open Intent Detection | Yihao Fang et.al. | 2308.13517v1 | link |
2023-08-25 | Does Asking Clarifying Questions Increases Confidence in Generated Code? On the Communication Skills of Large Language Models | Jie JW Wu et.al. | 2308.13507v1 | null |
2023-08-25 | A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance | Ian Colbert et.al. | 2308.13504v1 | null |
2023-08-25 | Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning | Pranav Balaji et.al. | 2308.13503v1 | null |
2023-08-24 | ROAM: Robust and Object-aware Motion Generation using Neural Pose Descriptors | Wanyue Zhang et.al. | 2308.12969v1 | null |
2023-08-24 | Dense Text-to-Image Generation with Attention Modulation | Yunji Kim et.al. | 2308.12964v1 | link |
2023-08-24 | MapPrior: Bird’s-Eye View Map Layout Estimation with Generative Models | Xiyue Zhu et.al. | 2308.12963v1 | null |
2023-08-24 | Motion-Guided Masking for Spatiotemporal Representation Learning | David Fan et.al. | 2308.12962v1 | null |
2023-08-24 | Less is More: Towards Efficient Few-shot 3D Semantic Segmentation via Training-free Networks | Xiangyang Zhu et.al. | 2308.12961v1 | link |
2023-08-24 | Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment | Sheng Zhang et.al. | 2308.12960v1 | link |
2023-08-24 | Semi-analytical Framework for Modeling Strong Coupling of Quantum Emitters in Electromagnetic Resonators | Mohammad Abutoama et.al. | 2308.12957v1 | null |
2023-08-24 | A new framework for global data regulation | Ellie Graeden et.al. | 2308.12955v1 | null |
2023-08-24 | BridgeData V2: A Dataset for Robot Learning at Scale | Homer Walke et.al. | 2308.12952v1 | link |
2023-08-24 | Label Budget Allocation in Multi-Task Learning | Ximeng Sun et.al. | 2308.12949v1 | null |
2023-08-23 | CHORUS: Learning Canonicalized 3D Human-Object Spatial Relations from Unbounded Synthesized Images | Sookwan Han et.al. | 2308.12288v1 | null |
2023-08-23 | Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models | Fredrik Heiding et.al. | 2308.12287v1 | null |
2023-08-23 | On-Manifold Projected Gradient Descent | Aaron Mahler et.al. | 2308.12279v1 | null |
2023-08-24 | A Model for Integrating Generative AI into Course Content Development | Ethan Dickey et.al. | 2308.12276v2 | null |
2023-08-23 | Spatial clustering of temporal energy profiles with empirical orthogonal functions and max-p regionalization | Claire Halloran et.al. | 2308.12274v1 | null |
2023-08-23 | Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models | Nancy Tyagi et.al. | 2308.12272v1 | null |
2023-08-23 | A Generative Approach for Image Registration of Visible-Thermal (VT) Cancer Faces | Catherine Ordun et.al. | 2308.12271v1 | null |
2023-08-23 | Language Reward Modulation for Pretraining Reinforcement Learning | Ademi Adeniji et.al. | 2308.12270v1 | link |
2023-08-22 | GRIP: Generating Interaction Poses Using Latent Consistency and Spatial Cues | Omid Taheri et.al. | 2308.11617v1 | null |
2023-08-22 | StoryBench: A Multifaceted Benchmark for Continuous Story Visualization | Emanuele Bugliarello et.al. | 2308.11606v1 | link |
2023-08-22 | GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning | Mainak Singha et.al. | 2308.11605v1 | null |
2023-08-22 | Towards Universal Interaction for Extended Reality | Pascal Knierim et.al. | 2308.11600v1 | null |
2023-08-22 | Theory of Transverse Mode Instability in Fiber Amplifiers with Multimode Excitations | Kabish Wisal et.al. | 2308.11599v1 | null |
2023-08-22 | Vision-Based Intelligent Robot Grasping Using Sparse Neural Network | Priya Shukla et.al. | 2308.11590v1 | null |
2023-08-21 | Structured World Models from Human Videos | Russell Mendonca et.al. | 2308.10901v1 | null |
2023-08-21 | TADA! Text to Animatable Digital Avatars | Tingting Liao et.al. | 2308.10899v1 | null |
2023-08-21 | Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation | Xueyi Liu et.al. | 2308.10898v1 | link |
2023-08-21 | Can Language Models Learn to Listen? | Evonne Ng et.al. | 2308.10897v1 | null |
2023-08-21 | Differentiable Shadow Mapping for Efficient Inverse Graphics | Markus Worchel et.al. | 2308.10896v1 | link |
2023-08-21 | Proton-Boron Fusion Yield Increased by Orders of Magnitude with Foam Targets | Wen-Qing Wei et.al. | 2308.10878v1 | null |
2023-08-21 | Analyzing Transformer Dynamics as Movement through Embedding Space | Sumeet S. Singh et.al. | 2308.10874v1 | null |
2023-08-18 | HumanLiff: Layer-wise 3D Human Generation with Diffusion Model | Shoukang Hu et.al. | 2308.09712v1 | null |
2023-08-18 | Robust Monocular Depth Estimation under Challenging Conditions | Stefano Gasperini et.al. | 2308.09711v1 | null |
2023-08-18 | SimDA: Simple Diffusion Adapter for Efficient Video Generation | Zhen Xing et.al. | 2308.09710v1 | null |
2023-08-18 | Training with Product Digital Twins for AutoRetail Checkout | Yue Yao et.al. | 2308.09708v1 | link |
2023-08-18 | Guide3D: Create 3D Avatars from Text and Image Guidance | Yukang Cao et.al. | 2308.09705v1 | null |
2023-08-18 | Counting and Sampling Labeled Chordal Graphs in Polynomial Time | Ursula Hebert-Johnson et.al. | 2308.09703v1 | null |
2023-08-16 | TeCH: Text-guided Reconstruction of Lifelike Clothed Humans | Yangyi Huang et.al. | 2308.08545v1 | link |
2023-08-16 | InsightMapper: A Closer Look at Inner-instance Information for Vectorized High-Definition Mapping | Zhenhua Xu et.al. | 2308.08543v1 | null |
2023-08-15 | Enumerating Tarski fixed points on lattices of binary relations | Julian Müller et.al. | 2308.07923v1 | null |
2023-08-15 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | Aojun Zhou et.al. | 2308.07921v1 | null |
2023-08-15 | The Regular Expression Inference Challenge | Mojtaba Valizadeh et.al. | 2308.07899v1 | null |
2023-08-15 | A Foundation LAnguage-Image model of the Retina (FLAIR): Encoding expert knowledge in text supervision | Julio Silva-Rodriguez et.al. | 2308.07898v1 | link |
2023-08-14 | Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation | Alexander Martin et.al. | 2308.07316v1 | link |
2023-08-14 | Reinforcing Security and Usability of Crypto-Wallet with Post-Quantum Cryptography and Zero-Knowledge Proof | Yathin Kethepalli et.al. | 2308.07309v1 | null |
2023-08-15 | LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked | Alec Helbling et.al. | 2308.07308v2 | null |
2023-08-14 | Extend Wave Function Collapse to Large-Scale Content Generation | Yuhe Nie et.al. | 2308.07307v1 | null |
2023-08-14 | Neural Authorship Attribution: Stylometric Analysis on Large Language Models | Tharindu Kumarage et.al. | 2308.07305v1 | link |
2023-08-14 | DiffSED: Sound Event Detection with Denoising Diffusion | Swapnil Bhosale et.al. | 2308.07293v1 | null |
2023-08-11 | Foundation Model is Efficient Multimodal Multitask Model Selector | Fanqing Meng et.al. | 2308.06262v1 | link |
2023-08-11 | Enhancing Network Management Using Code Generated by Large Language Models | Sathiya Kumaran Mani et.al. | 2308.06261v1 | link |
2023-08-11 | Self-Alignment with Instruction Backtranslation | Xian Li et.al. | 2308.06259v1 | null |
2023-08-11 | NEMA NU 2-2018 performance evaluation of a new generation digital 32-cm axial field-of-view Omni Legend PET-CT | Rhodri Lyn Smith et.al. | 2308.06255v1 | null |
2023-08-11 | Fundamental Limits on Subwavelength Range Resolution | Andrew N. Jordan et.al. | 2308.06252v1 | null |
2023-08-11 | ARGUS: Visualization of AI-Assisted Task Guidance in AR | Sonia Castelo et.al. | 2308.06246v1 | null |
2023-08-10 | PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs | Wentao Hu et.al. | 2308.05744v1 | link |
2023-08-10 | Neural Progressive Meshes | Yun-Chun Chen et.al. | 2308.05741v1 | null |
2023-08-10 | AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Haohe Liu et.al. | 2308.05734v1 | link |
2023-08-10 | FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models | Guangkai Xu et.al. | 2308.05733v1 | null |
2023-08-09 | Scene-Generalizable Interactive Segmentation of Radiance Fields | Songlin Tang et.al. | 2308.05104v1 | null |
2023-08-09 | LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation | Leigang Qu et.al. | 2308.05095v1 | null |
2023-08-08 | SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore | Sewon Min et.al. | 2308.04430v1 | link |
2023-08-08 | A Deep-Learning Method Using Auto-encoder and Generative Adversarial Network for Anomaly Detection on Ancient Stone Stele Surfaces | Yikun Liu et.al. | 2308.04426v1 | null |
2023-08-08 | Density-contrast induced inertial forces on particles in oscillatory flows | Siddhansh Agarwal et.al. | 2308.04423v1 | null |
2023-08-08 | Near-field 6G Networks: Why Mobile Terahertz Communications MUST Operate in the Near Field | Vitaly Petrov et.al. | 2308.04418v1 | null |
2023-08-08 | DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images | Xuechao Zou et.al. | 2308.04417v1 | link |
2023-08-07 | FSD V2: Improving Fully Sparse 3D Object Detection with Virtual Voxels | Lue Fan et.al. | 2308.03755v1 | link |
2023-08-07 | Mask Frozen-DETR: High Quality Instance Segmentation with One GPU | Zhanhao Liang et.al. | 2308.03747v1 | null |
2023-08-07 | A Cost Analysis of Generative Language Models and Influence Operations | Micah Musser et.al. | 2308.03740v1 | link |
2023-08-07 | Labeling without Seeing? Blind Annotation for Privacy-Preserving Entity Resolution | Yixiang Yao et.al. | 2308.03734v1 | null |
2023-08-07 | SurvBeX: An explanation method of the machine learning survival models based on the Beran estimator | Lev V. Utkin et.al. | 2308.03730v1 | link |
2023-08-04 | Recovering non-Maxwellian particle velocity distribution functions from collective Thomson-scattered spectra | Bryan C. Foo et.al. | 2308.02488v1 | null |
2023-08-04 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | Qihang Yu et.al. | 2308.02487v1 | link |
2023-08-04 | On the Inherent Anonymity of Gossiping | Rachid Guerraoui et.al. | 2308.02477v1 | null |
2023-08-04 | Towards Generalist Foundation Model for Radiology | Chaoyi Wu et.al. | 2308.02463v1 | link |
2023-08-04 | Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints | Yasunori Toshimitsu et.al. | 2308.02453v1 | link |
2023-08-03 | The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World | Weiyun Wang et.al. | 2308.01907v1 | link |
2023-08-03 | Revisiting Deformable Convolution for Depth Completion | Xinglong Sun et.al. | 2308.01905v1 | link |
2023-08-03 | UniSim: A Neural Closed-Loop Sensor Simulator | Ze Yang et.al. | 2308.01898v1 | null |
2023-08-03 | Strategies for optimizing plasmonic grating couplers with topology-based inverse design | Michael Efseaff et.al. | 2308.01893v1 | null |
2023-08-02 | ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders | Shawn Xu et.al. | 2308.01317v1 | null |
2023-08-02 | Patched Denoising Diffusion Models For High-Resolution Image Synthesis | Zheng Ding et.al. | 2308.01316v1 | link |
2023-08-02 | More Context, Less Distraction: Visual Classification by Inferring and Conditioning on Contextual Attributes | Bang An et.al. | 2308.01313v1 | link |
2023-08-02 | TEASMA: A Practical Approach for the Test Assessment of Deep Neural Networks using Mutation Analysis | Amin Abbasishahkoo et.al. | 2308.01311v1 | null |
2023-08-02 | Revisiting DETR Pre-training for Object Detection | Yan Ma et.al. | 2308.01300v1 | null |
2023-08-01 | LISA: Reasoning Segmentation via Large Language Model | Xin Lai et.al. | 2308.00692v1 | link |
2023-08-01 | AnyLoc: Towards Universal Visual Place Recognition | Nikhil Keetha et.al. | 2308.00688v1 | link |
2023-08-01 | Learning from Hypervectors: A Survey on Hypervector Encoding | Sercan Aygun et.al. | 2308.00685v1 | null |
2023-07-31 | Conformal PID Control for Time Series Prediction | Anastasios N. Angelopoulos et.al. | 2307.16895v1 | link |
2023-07-31 | A reduced order model for geometrically parameterized two-scale simulations of elasto-plastic microstructures under large deformations | Theron Guo et.al. | 2307.16894v1 | null |
2023-07-31 | LEONARDO: A Pan-European Pre-Exascale Supercomputer for HPC and AI Applications | Matteo Turisini et.al. | 2307.16885v1 | null |
2023-07-31 | HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution | Ehsan Kamalloo et.al. | 2307.16883v1 | link |
2023-07-31 | Image Synthesis under Limited Data: A Survey and Taxonomy | Mengping Yang et.al. | 2307.16879v1 | link |
2023-07-31 | Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy | Shibo Jie et.al. | 2307.16867v1 | link |
2023-07-28 | Uncertainty in Natural Language Generation: From Theory to Applications | Joris Baan et.al. | 2307.15703v1 | null |
2023-07-28 | The Strong Maximum Circulation Algorithm: A New Method for Aggregating Preference Rankings | Nathan Atkinson et.al. | 2307.15702v1 | null |
2023-07-31 | MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Ruopeng Gao et.al. | 2307.15700v2 | link |
2023-07-28 | PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding | Davide Boscaini et.al. | 2307.15692v1 | link |
2023-07-28 | Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Nico Gürtler et.al. | 2307.15690v1 | link |
2023-07-27 | PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking | Yang Zheng et.al. | 2307.15055v1 | link |
2023-07-27 | A Geometric Notion of Causal Probing | Clément Guerner et.al. | 2307.15054v1 | null |
2023-07-27 | A Transformer-based Approach for Arabic Offline Handwritten Text Recognition | Saleh Momeni et.al. | 2307.15045v1 | null |
2023-07-27 | Universal and Transferable Adversarial Attacks on Aligned Language Models | Andy Zou et.al. | 2307.15043v1 | link |
2023-07-27 | 3-Coloring $C_4$ or $C_3$-free Diameter Two Graphs | Tereza Klimošová et.al. | 2307.15036v1 | null |
2023-07-26 | WavJourney: Compositional Audio Creation with Large Language Models | Xubo Liu et.al. | 2307.14335v1 | link |
2023-07-26 | Towards Generalist Biomedical AI | Tao Tu et.al. | 2307.14334v1 | null |
2023-07-26 | Waypoint-Based Imitation Learning for Robotic Manipulation | Lucy Xiaoyang Shi et.al. | 2307.14326v1 | null |
2023-07-25 | Benchmarking and Analyzing Generative Data for Visual Recognition | Bo Li et.al. | 2307.13697v1 | null |
2023-07-25 | A Compact DAG for Storing and Searching Maximal Common Subsequences | Alessio Conte et.al. | 2307.13695v1 | null |
2023-07-25 | A Comprehensive Review of Recent Research Trends on UAVs | Kaled Telli et.al. | 2307.13691v1 | null |
2023-07-25 | Single reference treatment of strongly correlated H$_4$ and H$_{10}$ isomers with Richardson-Gaudin states | Paul A. Johnson et.al. | 2307.13690v1 | null |
2023-07-25 | All-optical GeV electron bunch generation in a laser-plasma accelerator via truncated-channel injection | A. Picksley et.al. | 2307.13689v1 | null |
2023-07-25 | The Visual Language of Fabrics | Valentin Deschaintre et.al. | 2307.13681v1 | null |
2023-07-25 | High Probability Analysis for Non-Convex Stochastic Optimization with Clipping | Shaojie Li et.al. | 2307.13680v1 | null |
2023-07-24 | A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Jindong Gu et.al. | 2307.12980v1 | link |
2023-07-24 | Evaluating the Ripple Effects of Knowledge Editing in Language Models | Roi Cohen et.al. | 2307.12976v1 | link |
2023-07-24 | Volcanic ash delimitation using Artificial Intelligence based on Pix2Pix | Christian Carrillo et.al. | 2307.12970v1 | null |
2023-07-24 | Aligning Large Language Models with Human: A Survey | Yufei Wang et.al. | 2307.12966v1 | link |
2023-07-24 | RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment | Kevin Yang et.al. | 2307.12950v1 | link |
2023-07-24 | Boosting Punctuation Restoration with Data Generation and Reinforcement Learning | Viet Dac Lai et.al. | 2307.12949v1 | link |
2023-07-21 | Advancing Ad Auction Realism: Practical Insights & Modeling Implications | Ming Chen et.al. | 2307.11732v1 | null |
2023-07-21 | OUTFOX: LLM-generated Essay Detection through In-context Learning with Adversarially Generated Examples | Ryuto Koike et.al. | 2307.11729v1 | link |
2023-07-21 | Benchmark datasets for biomedical knowledge graphs with negative statements | Rita T. Sousa et.al. | 2307.11719v1 | null |
2023-07-20 | L-Eval: Instituting Standardized Evaluation for Long Context Language Models | Chenxin An et.al. | 2307.11088v1 | link |
2023-07-20 | AlignDet: Aligning Pre-training and Fine-tuning in Object Detection | Ming Li et.al. | 2307.11077v1 | link |
2023-07-20 | OBJECT 3DIT: Language-guided 3D-aware Image Editing | Oscar Michel et.al. | 2307.11073v1 | null |
2023-07-19 | Adversarial Latent Autoencoder with Self-Attention for Structural Image Synthesis | Jiajie Fan et.al. | 2307.10166v1 | null |
2023-07-19 | Rethinking Backdoor Attacks | Alaa Khaddaj et.al. | 2307.10163v1 | null |
2023-07-19 | Robust Driving Policy Learning with Guided Meta Reinforcement Learning | Kanghoon Lee et.al. | 2307.10160v1 | null |
2023-07-19 | FABRIC: Personalizing Diffusion Models with Iterative Feedback | Dimitri von Rütte et.al. | 2307.10159v1 | link |
2023-07-19 | Contact-aware Shaping and Maintenance of Deformable Linear Objects With Fixtures | Kejia Chen et.al. | 2307.10153v1 | null |
2023-07-18 | Forecasting the steam mass flow in a powerplant using the parallel hybrid network | Andrii Kurkin et.al. | 2307.09483v1 | null |
2023-07-18 | AnyDoor: Zero-shot Object-level Image Customization | Xi Chen et.al. | 2307.09481v1 | link |
2023-07-18 | ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning | Liang Zhao et.al. | 2307.09474v1 | null |
2023-07-18 | Optimal Vehicle Trajectory Planning for Static Obstacle Avoidance using Nonlinear Optimization | Yajia Zhang et.al. | 2307.09466v1 | null |
2023-07-19 | Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla | Tom Lieberum et.al. | 2307.09458v2 | null |
2023-07-19 | A comparative analysis of SRGAN models | Fatemeh Rezapoor Nikroo et.al. | 2307.09456v2 | null |
2023-07-18 | Solving Knapsack with Small Items via L0-Proximity | Ce Jin et.al. | 2307.09454v1 | null |
2023-07-17 | Diffusion Models Beat GANs on Image Classification | Soumik Mukhopadhyay et.al. | 2307.08702v1 | null |
2023-07-17 | AlpaGasus: Training A Better Alpaca with Fewer Data | Lichang Chen et.al. | 2307.08701v1 | link |
2023-07-17 | Fast model inference and training on-board of Satellites | Vít Růžička et.al. | 2307.08700v1 | link |
2023-07-17 | Pair then Relation: Pair-Net for Panoptic Scene Graph Generation | Jinghao Wang et.al. | 2307.08699v1 | link |
2023-07-17 | Flow Matching in Latent Space | Quan Dao et.al. | 2307.08698v1 | link |
2023-07-17 | FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | Tri Dao et.al. | 2307.08691v1 | link |
2023-07-17 | COLLIE: Systematic Construction of Constrained Text Generation Tasks | Shunyu Yao et.al. | 2307.08689v1 | link |
2023-07-14 | NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis | Nilesh Kulkarni et.al. | 2307.07511v1 | null |
2023-07-14 | A Poisson Decomposition for Information and the Information-Event Diagram | Cheuk Ting Li et.al. | 2307.07506v1 | null |
2023-07-14 | Exhaustive Generation of Linear Orthogonal Cellular Automata | Enrico Formenti et.al. | 2307.07505v1 | null |
2023-07-14 | TALL: Thumbnail Layout for Deepfake Video Detection | Yuting Xu et.al. | 2307.07494v1 | link |
2023-07-14 | BehAVExplor: Behavior Diversity Guided Testing for Autonomous Driving Systems | Mingfei Cheng et.al. | 2307.07493v1 | null |
2023-07-14 | PseudoCal: A Source-Free Approach to Unsupervised Uncertainty Calibration in Domain Adaptation | Dapeng Hu et.al. | 2307.07489v1 | null |
2023-07-13 | HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models | Nataniel Ruiz et.al. | 2307.06949v1 | null |
2023-07-13 | Self-regulating Prompts: Foundational Model Adaptation without Forgetting | Muhammad Uzair Khattak et.al. | 2307.06948v1 | link |
2023-07-13 | In-context Autoencoder for Context Compression in a Large Language Model | Tao Ge et.al. | 2307.06945v1 | link |
2023-07-13 | InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | Yi Wang et.al. | 2307.06942v1 | link |
2023-07-13 | Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation | Yingqing He et.al. | 2307.06940v1 | link |
2023-07-12 | Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation | Andi Peng et.al. | 2307.06333v1 | null |
2023-07-12 | Deep Learning of Crystalline Defects from TEM images: A Solution for the Problem of “Never Enough Training Data” | Kishan Govind et.al. | 2307.06322v1 | null |
2023-07-12 | Facial Reenactment Through a Personalized Generator | Ariel Elazary et.al. | 2307.06307v1 | null |
2023-07-12 | Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes | Sohom Mukherjee et.al. | 2307.06306v1 | link |
2023-07-11 | Scale Alone Does not Improve Mechanistic Interpretability in Vision Models | Roland S. Zimmermann et.al. | 2307.05471v1 | null |
2023-07-12 | My3DGen: Building Lightweight Personalized 3D Generative Model | Luchao Qi et.al. | 2307.05468v2 | null |
2023-07-11 | EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone | Shraman Pramanick et.al. | 2307.05463v1 | link |
2023-07-11 | Efficient 3D Articulated Human Generation with Layered Surface Volumes | Yinghao Xu et.al. | 2307.05462v1 | null |
2023-07-10 | Semantic-SAM: Segment and Recognize Anything at Any Granularity | Feng Li et.al. | 2307.04767v1 | link |
2023-07-10 | Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos | Sagnik Majumder et.al. | 2307.04760v1 | null |
2023-07-10 | Information decomposition to identify relevant variation in complex systems with machine learning | Kieran A. Murphy et.al. | 2307.04755v1 | link |
2023-07-10 | Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement | Anthony Simeonov et.al. | 2307.04751v1 | null |
2023-07-10 | Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback | Jaskirat Singh et.al. | 2307.04749v1 | null |
2023-07-07 | On the Efficacy of Sampling Adapters | Clara Meister et.al. | 2307.03749v1 | link |
2023-07-07 | Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment | Sofia Eleni Spatharioti et.al. | 2307.03744v1 | null |
2023-07-07 | QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models | Tommaso Pegolotti et.al. | 2307.03738v1 | link |
2023-07-06 | Simulating Nelsonian Quantum Field Theory | Andrea Carosso et.al. | 2307.03188v1 | null |
2023-07-06 | Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers | Yuan Gong et.al. | 2307.03183v1 | link |
2023-07-06 | Markov Persuasion Processes with Endogenous Agent Beliefs | Krishnamurthy Iyer et.al. | 2307.03181v1 | null |
2023-07-07 | IPO-LDM: Depth-aided 360-degree Indoor RGB Panorama Outpainting via Latent Diffusion Model | Tianhao Wu et.al. | 2307.03177v2 | null |
2023-07-06 | Push Past Green: Learning to Look Behind Plant Foliage by Moving It | Xiaoyu Zhang et.al. | 2307.03175v1 | null |
2023-07-06 | Risk-Averse Trajectory Optimization via Sample Average Approximation | Thomas Lew et.al. | 2307.03167v1 | link |
2023-07-06 | VideoGLUE: Video General Understanding Evaluation of Foundation Models | Liangzhe Yuan et.al. | 2307.03166v1 | link |
2023-07-05 | LongNet: Scaling Transformers to 1,000,000,000 Tokens | Jiayu Ding et.al. | 2307.02486v1 | link |
2023-07-05 | Elastic Decision Transformer | Yueh-Hua Wu et.al. | 2307.02484v1 | link |
2023-07-05 | Jailbroken: How Does LLM Safety Training Fail? | Alexander Wei et.al. | 2307.02483v1 | null |
2023-07-05 | Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks | Zhaofeng Wu et.al. | 2307.02477v1 | link |
2023-07-05 | The Calissons Puzzle | Jean-Marie Favreau et.al. | 2307.02475v1 | null |
2023-07-06 | Deductive Additivity for Planning of Natural Language Proofs | Zayne Sprague et.al. | 2307.02472v2 | link |
2023-07-05 | What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? | Yan Zeng et.al. | 2307.02469v1 | null |
2023-07-03 | Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning | Yuxiang Zhang et.al. | 2307.01200v1 | null |
2023-07-03 | NeuBTF: Neural fields for BTF encoding and transfer | Carlos Rodriguez-Pardo et.al. | 2307.01199v1 | null |
2023-07-03 | Improved sampling via learned diffusions | Lorenz Richter et.al. | 2307.01198v1 | null |
2023-07-03 | Segment Anything Meets Point Tracking | Frano Rajič et.al. | 2307.01197v1 | link |
2023-07-03 | Squeezing Large-Scale Diffusion Models for Mobile | Jiwoong Choi et.al. | 2307.01193v1 | null |
2023-07-03 | SAMAug: Point Prompt Augmentation for Segment Anything Model | Haixing Dai et.al. | 2307.01187v1 | link |
2023-07-03 | Continuously Red-Shift and Blue-Shift Wavelength-Tuneable, Narrowband, High Harmonics in the EUV - X-ray Regime for Resonance Imaging and Spectroscopies | Dimitar Popmintchev et.al. | 2307.01182v1 | null |
2023-06-30 | Hardwiring ViT Patch Selectivity into CNNs using Patch Mixing | Ariel N. Lee et.al. | 2306.17848v1 | null |
2023-06-30 | Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors | Guocheng Qian et.al. | 2306.17843v1 | link |
2023-07-03 | SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs | Lijun Yu et.al. | 2306.17842v2 | link |
2023-07-03 | Statler: State-Maintaining Language Models for Embodied Reasoning | Takuma Yoneda et.al. | 2306.17840v2 | null |
2023-06-30 | Federated Ensemble YOLOv5 - A Better Generalized Object Detection Algorithm | Vinit Hegiste et.al. | 2306.17829v1 | null |
2023-06-30 | Understanding Unfairness via Training Concept Influence | Yuanshun Yao et.al. | 2306.17828v1 | null |
2023-06-29 | An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training | Zitian Chen et.al. | 2306.17165v1 | null |
2023-06-30 | Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors | Tung Phung et.al. | 2306.17156v2 | null |
2023-06-29 | Generate Anything Anywhere in Any Scene | Yuheng Li et.al. | 2306.17154v1 | null |
2023-06-28 | MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning | Paul Pu Liang et.al. | 2306.16413v1 | link |
2023-06-29 | Even order contributions to relative energies vanish for antisymmetric perturbations | O. Anatole von Lilienfeld et.al. | 2306.16409v2 | null |
2023-06-27 | Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties | Hsiao-Yu Tung et.al. | 2306.15668v1 | null |
2023-06-28 | PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment | Jianyuan Wang et.al. | 2306.15667v2 | null |
2023-06-27 | SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate through Compiler Co-design | Fu-Ming Guo et.al. | 2306.15656v1 | null |
2023-06-27 | Optimal Area-Sensitive Bounds for Polytope Approximation | Sunil Arya et.al. | 2306.15648v1 | null |
2023-06-26 | FunQA: Towards Surprising Video Comprehension | Binzhu Xie et.al. | 2306.14899v1 | link |
2023-06-27 | InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback | John Yang et.al. | 2306.14898v2 | link |
2023-06-26 | Supervised Pretraining Can Learn In-Context Reinforcement Learning | Jonathan N. Lee et.al. | 2306.14892v1 | null |
2023-06-26 | Value of Information in Games with Multiple Strategic Information Providers | Raj Kiriti Velicheti et.al. | 2306.14886v1 | null |
2023-06-26 | Restart Sampling for Improving Generative Processes | Yilun Xu et.al. | 2306.14878v1 | link |
2023-06-26 | Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits | Yuwei Luo et.al. | 2306.14872v1 | null |
2023-06-26 | Composing Parameter-Efficient Modules with Arithmetic Operations | Jinghan Zhang et.al. | 2306.14870v1 | link |
2023-06-23 | GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models | Rishabh Agarwal et.al. | 2306.13649v1 | null |
2023-06-23 | Offline Skill Graph (OSG): A Framework for Learning and Planning using Offline Reinforcement Learning Skills | Ben-ya Halevy et.al. | 2306.13630v1 | null |
2023-06-22 | Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces | Fahad Shamshad et.al. | 2306.13091v1 | link |
2023-06-22 | PromptIR: Prompting for All-in-One Blind Image Restoration | Vaishnav Potlapalli et.al. | 2306.13090v1 | link |
2023-06-22 | Improved Signal Detection for Ambient Backscatter Communications | S. Zargari et.al. | 2306.13083v1 | null |
2023-06-21 | VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution | Siobhan Mackenzie Hall et.al. | 2306.12424v1 | link |
2023-06-21 | Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized Codebase | Qiuyu Wang et.al. | 2306.12423v1 | link |
2023-06-21 | LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models | Shizhe Diao et.al. | 2306.12420v1 | link |
2023-06-21 | Coqlex: Generating Formally Verified Lexers | Wendlasida Ouedraogo et.al. | 2306.12411v1 | null |
2023-06-20 | Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning | Huiguo He et.al. | 2306.11731v1 | null |
2023-06-20 | Dense Video Object Captioning from Disjoint Supervision | Xingyi Zhou et.al. | 2306.11729v1 | link |
2023-06-20 | Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision | Ayush Tewari et.al. | 2306.11719v1 | null |
2023-06-20 | Multi-Fidelity Active Learning with GFlowNets | Alex Hernandez-Garcia et.al. | 2306.11715v1 | link |
2023-06-20 | Data-Driven but Privacy-Conscious: Pedestrian Dataset De-identification via Full-Body Person Synthesis | Maxim Maximov et.al. | 2306.11710v1 | null |
2023-06-16 | Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness | Eric Zelikman et.al. | 2306.10015v1 | link |
2023-06-20 | CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search | Fahad Shamshad et.al. | 2306.10008v2 | link |
2023-06-16 | C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction | Luoyuan Xu et.al. | 2306.10003v1 | null |
2023-06-16 | SLACK: Stable Learning of Augmentations with Cold-start and KL regularization | Juliette Marrie et.al. | 2306.09998v1 | null |
2023-06-16 | Fairness in Preference-based Reinforcement Learning | Umer Siddique et.al. | 2306.09995v1 | null |
2023-06-16 | Rosetta Neurons: Mining the Common Units in a Model Zoo | Amil Dravid et.al. | 2306.09346v2 | null |
2023-06-15 | Evaluating Data Attribution for Text-to-Image Models | Sheng-Yu Wang et.al. | 2306.09345v1 | link |
2023-06-15 | DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data | Stephanie Fu et.al. | 2306.09344v1 | link |
2023-06-15 | Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis | Xiaoshi Wu et.al. | 2306.09341v1 | link |
2023-06-15 | Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking | Björn Bebensee et.al. | 2306.09340v1 | null |
2023-06-15 | From BERT to GPT-3 Codex: Harnessing the Potential of Very Large Language Models for Data Management | Immanuel Trummer et.al. | 2306.09339v1 | null |
2023-06-15 | Generative Proxemics: A Prior for 3D Social Interaction from Images | Lea Müller et.al. | 2306.09337v1 | link |
2023-06-15 | Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Markov Chains | Yilong Qin et.al. | 2306.09332v1 | null |
2023-06-15 | ArtFusion: Arbitrary Style Transfer using Dual Conditional Latent Diffusion Models | Dar-Yen Chen et.al. | 2306.09330v1 | link |
2023-06-13 | XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models | Omkar Thawkar et.al. | 2306.07971v1 | link |
2023-06-13 | GeneCIS: A Benchmark for General Conditional Image Similarity | Sagar Vaze et.al. | 2306.07969v1 | null |
2023-06-13 | One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning | Arnav Chavan et.al. | 2306.07967v1 | link |
2023-06-13 | Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation | Shuai Yang et.al. | 2306.07954v1 | null |
2023-06-12 | Waffling around for Performance: Visual Classification with Random Words and Broad Concepts | Karsten Roth et.al. | 2306.07282v1 | link |
2023-06-12 | Controlling Text-to-Image Diffusion by Orthogonal Finetuning | Zeju Qiu et.al. | 2306.07280v1 | null |
2023-06-12 | Scalable 3D Captioning with Pretrained Models | Tiange Luo et.al. | 2306.07279v1 | link |
2023-06-12 | Mathematical conjecture generation using machine intelligence | Challenger Mishra et.al. | 2306.07277v1 | null |
2023-06-12 | Operator Learning with Neural Fields: Tackling PDEs on General Geometries | Louis Serrano et.al. | 2306.07266v1 | link |
2023-06-12 | On the Collocated Form with Input Decoupling of Lagrangian Systems | Pietro Pustina et.al. | 2306.07258v1 | null |
2023-06-09 | Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding | Mu Cai et.al. | 2306.06094v1 | null |
2023-06-09 | HyP-NeRF: Learning Improved NeRF Priors using a HyperNetwork | Bipasha Sen et.al. | 2306.06093v1 | null |
2023-06-09 | Computational Flash Photography through Intrinsics | Sepideh Sarajian Maralan et.al. | 2306.06089v1 | null |
2023-06-09 | SENS: Sketch-based Implicit Neural Shape Modeling | Alexandre Binninger et.al. | 2306.06088v1 | null |
2023-06-09 | Learning Not to Spoof | David Byrd et.al. | 2306.06087v1 | null |
2023-06-09 | Developing Speech Processing Pipelines for Police Accountability | Anjalie Field et.al. | 2306.06086v1 | null |
2023-06-08 | Background Prompting for Improved Object Depth | Manel Baradad et.al. | 2306.05428v1 | null |
2023-06-08 | Grounded Text-to-Image Synthesis with Attention Refocusing | Quynh Phung et.al. | 2306.05427v1 | null |
2023-06-08 | SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking | Chris Cundy et.al. | 2306.05426v1 | null |
2023-06-08 | MIMIC-IT: Multi-Modal In-Context Instruction Tuning | Bo Li et.al. | 2306.05425v1 | link |
2023-06-08 | Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | Muhammad Maaz et.al. | 2306.05424v1 | link |
2023-06-08 | ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process | Changyao Tian et.al. | 2306.05423v1 | null |
2023-06-08 | Stochastic Multi-Person 3D Motion Forecasting | Sirui Xu et.al. | 2306.05421v1 | link |
2023-06-08 | Scaling Spherical CNNs | Carlos Esteves et.al. | 2306.05420v1 | link |
2023-06-08 | 2D Supervised Monocular 3D Object Detection by Global-to-Local 3D Reconstruction | Jiawei He et.al. | 2306.05418v1 | null |
2023-06-07 | Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection | Yu Bai et.al. | 2306.04637v1 | link |
2023-06-07 | GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image Translation | Shuai Yang et.al. | 2306.04636v1 | link |
2023-06-07 | On the Reliability of Watermarks for Large Language Models | John Kirchenbauer et.al. | 2306.04634v1 | link |
2023-06-07 | Designing a Better Asymmetric VQGAN for StableDiffusion | Zixin Zhu et.al. | 2306.04632v1 | link |
2023-06-07 | Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design | Julien Roy et.al. | 2306.04620v1 | null |
2023-06-07 | Helicity-dependent optical control of the magnetization state emerging from the Landau-Lifshitz-Gilbert equation | Benjamin Assouline et.al. | 2306.04617v1 | null |
2023-06-07 | ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | Chenxu Hu et.al. | 2306.03901v2 | null |
2023-06-06 | Model Spider: Learning to Rank Pre-Trained Models Efficiently | Yi-Kai Zhang et.al. | 2306.03900v1 | null |
2023-06-06 | Towards Label-free Scene Understanding by Vision Foundation Models | Runnan Chen et.al. | 2306.03899v1 | link |
2023-06-05 | Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction | Rose E. Wang et.al. | 2306.03090v1 | link |
2023-06-05 | Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models | Andrew F. Luo et.al. | 2306.03089v1 | null |
2023-06-05 | DeepGraphDMD: Interpretable Spatio-Temporal Decomposition of Non-linear Functional Brain Network Dynamics | Md Asadullah Turja et.al. | 2306.03088v1 | link |
2023-06-05 | MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion | Chiyu Max Jiang et.al. | 2306.03083v1 | null |
2023-06-05 | InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models | Lichang Chen et.al. | 2306.03082v1 | link |
2023-06-05 | Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs | Alexander K. Lew et.al. | 2306.03081v1 | link |
2023-06-05 | A General Perspective on Objectives of Reinforcement Learning | Long Yang et.al. | 2306.03074v1 | null |
2023-06-05 | Explore to Generalize in Zero-Shot RL | Ev Zisselman et.al. | 2306.03072v1 | link |
2023-06-02 | Multilingual Conceptual Coverage in Text-to-Image Models | Michael Saxon et.al. | 2306.01735v1 | link |
2023-06-02 | DocFormerv2: Local Features for Document Understanding | Srikar Appalaraju et.al. | 2306.01733v1 | null |
2023-06-02 | Video Colorization with Pre-trained Text-to-Image Diffusion Models | Hanyuan Liu et.al. | 2306.01732v1 | null |
2023-06-02 | Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans | Stefania Raimondo et.al. | 2306.01729v1 | null |
2023-06-02 | Denoising Diffusion Semantic Segmentation with Mask Prior Modeling | Zeqiang Lai et.al. | 2306.01721v1 | link |
2023-06-02 | Fresh Content Needs More Attention: Multi-funnel Fresh Content Recommendation | Jianling Wang et.al. | 2306.01720v1 | null |
2023-06-02 | Discreteness of asymptotic tensor ranks | Jop Briët et.al. | 2306.01718v1 | null |
2023-06-01 | StyleGAN knows Normal, Depth, Albedo, and More | Anand Bhattad et.al. | 2306.00987v1 | null |
2023-06-02 | Diffusion Self-Guidance for Controllable Image Generation | Dave Epstein et.al. | 2306.00986v2 | null |
2023-06-01 | StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners | Yonglong Tian et.al. | 2306.00984v1 | link |
2023-06-01 | StyleDrop: Text-to-Image Generation in Any Style | Kihyuk Sohn et.al. | 2306.00983v1 | null |
2023-06-01 | SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds | Yanyu Li et.al. | 2306.00980v1 | link |
2023-06-01 | AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Ji Lin et.al. | 2306.00978v1 | link |
2023-06-01 | Intriguing Properties of Text-guided Diffusion Models | Qihao Liu et.al. | 2306.00974v1 | link |
2023-06-01 | Intelligent Grimm – Open-ended Visual Storytelling via Latent Diffusion Models | Chang Liu et.al. | 2306.00973v1 | link |
2023-06-01 | Too Large; Data Reduction for Vision-Language Pre-Training | Alex Jinpeng Wang et.al. | 2305.20087v2 | link |
2023-05-31 | Understanding and Mitigating Copying in Diffusion Models | Gowthami Somepalli et.al. | 2305.20086v1 | link |
2023-05-31 | Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor | Ruizhi Shao et.al. | 2305.20082v1 | null |
2023-05-31 | On the Capacity of Secure $K$-user Product Computation over a Quantum MAC | Yuxiang Lu et.al. | 2305.20073v1 | null |
2023-05-31 | Latent Exploration for Reinforcement Learning | Alberto Silvio Chiappa et.al. | 2305.20065v1 | link |
2023-05-31 | Chatting Makes Perfect – Chat-based Image Retrieval | Matan Levy et.al. | 2305.20062v1 | link |
2023-05-30 | Concise Answers to Complex Questions: Summarization of Long-form Answers | Abhilash Potluri et.al. | 2305.19271v1 | link |
2023-05-30 | Microfluidics Generation of Millimeter-sized Matrigel Droplets | Cory Arnold et.al. | 2305.19261v1 | null |
2023-05-30 | Shuffle SGD is Always Better than SGD: Improved Analysis of SGD with Arbitrary Data Orders | Anastasia Koloskova et.al. | 2305.19259v1 | null |
2023-05-30 | Ambient Diffusion: Learning Clean Distributions from Corrupted Data | Giannis Daras et.al. | 2305.19256v1 | link |
2023-05-30 | What Can We Learn from Unlearnable Datasets? | Pedro Sandoval-Segura et.al. | 2305.19254v1 | link |
2023-05-29 | RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths | Zeyue Xue et.al. | 2305.18295v1 | null |
2023-05-29 | Transformer Language Models Handle Word Frequency in Prediction Head | Goro Kobayashi et.al. | 2305.18294v1 | null |
2023-05-29 | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | Rafael Rafailov et.al. | 2305.18290v1 | link |
2023-05-29 | LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections | M. Jehanzeb Mirza et.al. | 2305.18287v1 | null |
2023-05-29 | Characterization and evasion of backscattered light in the squeezed-light enhanced gravitational wave interferometer GEO 600 | Fabio Bergamin et.al. | 2305.18284v1 | null |
2023-05-29 | Contextual Object Detection with Multimodal Large Language Models | Yuhang Zang et.al. | 2305.18279v1 | link |
2023-05-26 | NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support | Xinyue Wei et.al. | 2305.17134v1 | null |
2023-05-26 | RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation | Gabriele Sarti et.al. | 2305.17131v1 | null |
2023-05-26 | Characterizing and Measuring Linguistic Dataset Drift | Tyler A. Chang et.al. | 2305.17127v1 | link |
2023-05-26 | Large Language Models as Tool Makers | Tianle Cai et.al. | 2305.17126v1 | link |
2023-05-26 | Manifold Regularization for Memory-Efficient Training of Deep Neural Networks | Shadi Sartipi et.al. | 2305.17119v1 | null |
2023-05-26 | Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time | Zichang Liu et.al. | 2305.17118v1 | null |
2023-05-26 | Improving accuracy of GPT-3/4 results on biomedical data using a retrieval-augmented language model | David Soong et.al. | 2305.17116v1 | null |
2023-05-25 | Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models | Shihao Zhao et.al. | 2305.16322v1 | link |
2023-05-25 | Parallel Sampling of Diffusion Models | Andy Shih et.al. | 2305.16317v1 | link |
2023-05-25 | NAP: Neural 3D Articulation Prior | Jiahui Lei et.al. | 2305.16315v1 | null |
2023-05-26 | Banana: Banach Fixed-Point Network for Pointcloud Segmentation with Inter-Part Equivariance | Congyue Deng et.al. | 2305.16314v2 | null |
2023-05-25 | UMat: Uncertainty-Aware Single Image High Resolution Material Capture | Carlos Rodriguez-Pardo et.al. | 2305.16312v1 | null |
2023-05-25 | Break-A-Scene: Extracting Multiple Concepts from a Single Image | Omri Avrahami et.al. | 2305.16311v1 | link |
2023-05-25 | Securing Deep Generative Models with Universal Adversarial Signature | Yu Zeng et.al. | 2305.16310v1 | link |
2023-05-25 | Imitating Task and Motion Planning with Visuomotor Transformers | Murtaza Dalal et.al. | 2305.16309v1 | null |
2023-05-25 | Fine-Grained Complexity Analysis of Multi-Agent Path Finding on 2D Grids | Tzvika Geft et.al. | 2305.16303v1 | null |
2023-05-24 | Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective | Guhao Feng et.al. | 2305.15408v1 | link |
2023-05-24 | Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets | Brandon Smith et.al. | 2305.15407v1 | link |
2023-05-24 | Sin3DM: Learning a Diffusion Model from a Single 3D Textured Shape | Rundi Wu et.al. | 2305.15399v1 | link |
2023-05-24 | LayoutGPT: Compositional Visual Planning and Generation with Large Language Models | Weixi Feng et.al. | 2305.15393v1 | link |
2023-05-24 | A Neural Space-Time Representation for Text-to-Image Personalization | Yuval Alaluf et.al. | 2305.15391v1 | link |
2023-05-24 | Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering | Avi Caciularu et.al. | 2305.15387v1 | link |
2023-05-23 | NCHO: Unsupervised Learning for Neural 3D Composition of Humans and Objects | Taeksoo Kim et.al. | 2305.14345v1 | link |
2023-05-23 | Video Prediction Models as Rewards for Reinforcement Learning | Alejandro Escontrela et.al. | 2305.14343v1 | null |
2023-05-23 | APPLS: A Meta-evaluation Testbed for Plain Language Summarization | Yue Guo et.al. | 2305.14341v1 | link |
2023-05-23 | Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence | Grace Luo et.al. | 2305.14334v1 | null |
2023-05-23 | Evaluating and Modeling Attribution for Cross-Lingual Question Answering | Benjamin Muller et.al. | 2305.14332v1 | null |
2023-05-23 | Large Language Models are Frame-level Directors for Zero-shot Text-to-Video Generation | Susung Hong et.al. | 2305.14330v1 | link |
2023-05-23 | Zero-sum Polymatrix Markov Games: Equilibrium Collapse and Efficient Computation of Nash Equilibria | Fivos Kalogiannis et.al. | 2305.14329v1 | null |
2023-05-23 | Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation | Da Yin et.al. | 2305.14327v1 | link |
2023-05-22 | Contextualising Implicit Representations for Semantic Tasks | Theo W. Costain et.al. | 2305.13312v1 | null |
2023-05-22 | VDT: An Empirical Study on Video Diffusion with Transformers | Haoyu Lu et.al. | 2305.13311v1 | link |
2023-05-22 | Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | Yang Liu et.al. | 2305.13310v1 | link |
2023-05-22 | Evaluating Factual Consistency of Texts with Semantic Role Labeling | Jing Fan et.al. | 2305.13309v1 | link |
2023-05-22 | If at First You Don’t Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection | Shyamgopal Karthik et.al. | 2305.13308v1 | link |
2023-05-22 | NeRFuser: Large-Scale Scene Representation by NeRF Fusion | Jiading Fang et.al. | 2305.13307v1 | link |
2023-05-22 | Growth of ultrawide-bandgap BN/diamond heterostructures by pulsed laser deposition | Abhijit Biswas et.al. | 2305.13306v1 | null |
2023-05-22 | RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | Wangchunshu Zhou et.al. | 2305.13304v1 | link |
2023-05-23 | Training Diffusion Models with Reinforcement Learning | Kevin Black et.al. | 2305.13301v2 | link |
2023-05-22 | Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations | Chenglei Si et.al. | 2305.13299v1 | link |
2023-05-19 | Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models | Byungjun Kim et.al. | 2305.11870v1 | link |
2023-05-19 | Reducing Sequence Length by Predicting Edit Operations with Large Language Models | Masahiro Kaneko et.al. | 2305.11862v1 | null |
2023-05-19 | Video Killed the HD-Map: Predicting Driving Behavior Directly From Drone Images | Yunpeng Liu et.al. | 2305.11856v1 | null |
2023-05-19 | Multimodal Web Navigation with Instruction-Finetuned Foundation Models | Hiroki Furuta et.al. | 2305.11854v1 | null |
2023-05-19 | Poincare and Einstein on Mass-Energy Equivalence: A Modern Perspective on their 1900 and 1905 Papers | Patrick Moylan et.al. | 2305.11852v1 | null |
2023-05-19 | Any-to-Any Generation via Composable Diffusion | Zineng Tang et.al. | 2305.11846v1 | link |
2023-05-18 | Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | Siyuan Huang et.al. | 2305.11176v1 | link |
2023-05-18 | VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks | Wenhai Wang et.al. | 2305.11175v1 | link |
2023-05-18 | Going Denser with Open-Vocabulary Part Segmentation | Peize Sun et.al. | 2305.11173v1 | link |
2023-05-18 | ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | Peng Wang et.al. | 2305.11172v1 | link |
2023-05-18 | TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models | Zorik Gekhman et.al. | 2305.11171v1 | link |
2023-05-18 | Efficient Prompting via Dynamic In-Context Learning | Wangchunshu Zhou et.al. | 2305.11170v1 | null |
2023-05-18 | Evidence of Meaning in Language Models Trained on Programs | Charles Jin et.al. | 2305.11169v1 | null |
2023-05-17 | FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | Guangxuan Xiao et.al. | 2305.10431v1 | link |
2023-05-17 | CLIP-GCD: Simple Language Guided Generalized Category Discovery | Rabah Ouldnoughi et.al. | 2305.10420v1 | null |
2023-05-17 | Towards Multi-Layered 3D Garments Animation | Yidi Shao et.al. | 2305.10418v1 | null |
2023-05-17 | Scratch Copilot Evaluation: Assessing AI-Assisted Creative Coding for Families | Stefania Druga et.al. | 2305.10417v1 | null |
2023-05-18 | PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering | Xiaoman Zhang et.al. | 2305.10415v2 | link |
2023-05-17 | AI Friends: A Design Framework for AI-Powered Creative Programming for Youth | Stefania Druga et.al. | 2305.10412v1 | null |
2023-05-17 | Data Extraction via Semantic Regular Expression Synthesis | Qiaochu Chen et.al. | 2305.10401v1 | null |
2023-05-16 | Understanding 3D Object Interaction from a Single Image | Shengyi Qian et.al. | 2305.09664v1 | link |
2023-05-16 | Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation | Samaneh Azadi et.al. | 2305.09662v1 | null |
2023-05-16 | Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage | Jose Blanchet et.al. | 2305.09659v1 | null |
2023-05-16 | Newad: A register map automation tool for Verilog | Vamsi K Vytla et.al. | 2305.09657v1 | null |
2023-05-17 | Satisfiability-Aided Language Models Using Declarative Prompting | Xi Ye et.al. | 2305.09656v2 | link |
2023-05-16 | Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation | Yuxin Ren et.al. | 2305.09651v1 | link |
2023-05-16 | Wavelet-based Unsupervised Label-to-Image Translation | George Eskandar et.al. | 2305.09647v1 | link |
2023-05-15 | Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models | Antoni Bigata Casademunt et.al. | 2305.08854v1 | link |
2023-05-15 | CQE: A Comprehensive Quantity Extractor | Satya Almasian et.al. | 2305.08853v1 | link |
2023-05-15 | MV-Map: Offboard HD-Map Generation with Multi-view Consistency | Ziyang Xie et.al. | 2305.08851v1 | link |
2023-05-15 | Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts | Yuyang Zhao et.al. | 2305.08850v1 | null |
2023-05-15 | Privacy Auditing with One (1) Training Run | Thomas Steinke et.al. | 2305.08846v1 | null |
2023-05-15 | Large Language Models are Zero-Shot Rankers for Recommender Systems | Yupeng Hou et.al. | 2305.08845v1 | link |
2023-05-15 | RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs | Afra Feyza Akyürek et.al. | 2305.08844v1 | link |
2023-05-15 | Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks | Minyoung Huh et.al. | 2305.08842v1 | null |
2023-05-15 | Attacking Perceptual Similarity Metrics | Abhijay Ghildyal et.al. | 2305.08840v1 | null |
2023-05-12 | Text2Cohort: Democratizing the NCI Imaging Data Commons with Natural Language Cohort Discovery | Pranav Kulkarni et.al. | 2305.07637v1 | link |
2023-05-12 | Development of MC/DC: a performant, scalable, and portable Python-based Monte Carlo neutron transport code | Ilham Variansyah et.al. | 2305.07636v1 | link |
2023-05-12 | Zero-shot Item-based Recommendation via Multi-task Product Knowledge Graph Pre-Training | Ziwei Fan et.al. | 2305.07633v1 | null |
2023-05-12 | Design, Development, and Evaluation of an Interactive Personalized Social Robot to Monitor and Coach Post-Stroke Rehabilitation Exercises | Min Hun Lee et.al. | 2305.07632v1 | null |
2023-05-11 | SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views | Weihao Cheng et.al. | 2305.07024v1 | link |
2023-05-11 | Simple Token-Level Confidence Improves Caption Correctness | Suzanne Petryk et.al. | 2305.07021v1 | null |
2023-05-11 | A General-Purpose Multilingual Document Encoder | Onur Galoğlu et.al. | 2305.07016v1 | link |
2023-05-11 | Exploiting Diffusion Prior for Real-World Image Super-Resolution | Jianyi Wang et.al. | 2305.07015v1 | link |
2023-05-11 | Occam’s razor for AI: Coarse-graining Hammett Inspired Product Ansatz in Chemical Space | Marco Bragato et.al. | 2305.07010v1 | null |
2023-05-11 | Fair Price Discrimination | Siddhartha Banerjee et.al. | 2305.07006v1 | null |
2023-05-11 | Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation | Francois Meyer et.al. | 2305.07005v1 | link |
2023-05-11 | Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting | Haoyang Huang et.al. | 2305.07004v1 | null |
2023-05-11 | Real-time Manipulation of Liquid Droplets using Photo-responsive Surfactant | Xichen Liang et.al. | 2305.07002v1 | null |
2023-05-10 | Generalizations and Extensions to Lifting Constructions for Coded Caching | V. R. Aravind et.al. | 2305.06352v1 | null |
2023-05-10 | RECKONING: Reasoning through Dynamic Knowledge Encoding | Zeming Chen et.al. | 2305.06349v1 | link |
2023-05-10 | Frequency-Supported Neural Networks for Nonlinear Dynamical System Identification | Krzysztof Zając et.al. | 2305.06344v1 | link |
2023-05-10 | Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs | Roei Herzig et.al. | 2305.06343v1 | null |
2023-05-10 | Generalized Stratified Sampling for Efficient Reliability Assessment of Structures Against Natural Hazards | Srinivasan Arunachalam et.al. | 2305.06338v1 | null |
2023-05-10 | K-UniMorph: Korean Universal Morphology and its Feature Schema | Eunkyul Leah Jo et.al. | 2305.06335v1 | link |
2023-05-10 | Direct-Laser-Written Polymer Nanowire Waveguides for Broadband Single Photon Collection from Epitaxial Quantum Dots into a Gaussian-like Mode | Edgar Perez et.al. | 2305.06333v1 | null |
2023-05-09 | Policy Gradient Methods in the Presence of Symmetries and State Abstractions | Prakash Panangaden et.al. | 2305.05666v1 | link |
2023-05-09 | ImageBind: One Embedding Space To Bind Them All | Rohit Girdhar et.al. | 2305.05665v1 | link |
2023-05-10 | InternChat: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language | Zhaoyang Liu et.al. | 2305.05662v2 | link |
2023-05-09 | TidyBot: Personalized Robot Assistance with Large Language Models | Jimmy Wu et.al. | 2305.05658v1 | link |
2023-05-09 | Using Knowledge Units of Programming Languages to Recommend Reviewers for Pull Requests: An Empirical Study | Md Ahasanuzzaman et.al. | 2305.05654v1 | null |
2023-05-09 | Asymmetric $X$-Secure $T$-Private Information Retrieval: More Databases is Not Always Better | Mohamed Nomeir et.al. | 2305.05649v1 | null |
2023-05-08 | Learning to Evaluate the Artness of AI-generated Images | Junyu Chen et.al. | 2305.04923v1 | null |
2023-05-08 | DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models | Sicheng Yang et.al. | 2305.04919v1 | link |
2023-05-08 | What Do Patients Say About Their Disease Symptoms? Deep Multilabel Text Classification With Human-in-the-Loop Curation for Automatic Labeling of Patient Self Reports of Problems | Lakshmi Arbatti et.al. | 2305.04905v1 | null |
2023-05-08 | Robust Positivity Problems for low-order Linear Recurrence Sequences | Mihir Vahanwala et.al. | 2305.04870v1 | null |
2023-05-05 | On the Benefits of Semi-Supervised Test Case Generation for Cyber-Physical Systems | Xiao Ling et.al. | 2305.03714v1 | null |
2023-05-05 | Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos | Ekta Prashnani et.al. | 2305.03713v1 | null |
2023-05-08 | On the characterization of the convective heat flux in turbulent Rayleigh-Bénard convection | Bérengère Podvin et.al. | 2305.03708v2 | null |
2023-05-05 | LMEye: An Interactive Perception Network for Large Language Models | Yunxin Li et.al. | 2305.03701v1 | link |
2023-05-05 | Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements | Jiacheng Liu et.al. | 2305.03695v1 | link |
2023-05-05 | Mining bias-target Alignment from Voronoi Cells | Rémi Nahon et.al. | 2305.03691v1 | link |
2023-05-05 | COLA: How to adapt vision-language models to Compose Objects Localized with Attributes? | Arijit Ray et.al. | 2305.03689v1 | link |
2023-05-04 | ZipIt! Merging Models from Different Tasks without Training | George Stoica et.al. | 2305.03053v1 | link |
2023-05-04 | Controllable Visual-Tactile Synthesis | Ruihan Gao et.al. | 2305.03051v1 | link |
2023-05-04 | NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds | Jun-Kun Chen et.al. | 2305.03049v1 | null |
2023-05-04 | Personalize Segment Anything Model with One Shot | Renrui Zhang et.al. | 2305.03048v1 | link |
2023-05-04 | Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision | Zhiqing Sun et.al. | 2305.03047v1 | link |
2023-05-04 | OctFormer: Octree-based Transformers for 3D Point Clouds | Peng-Shuai Wang et.al. | 2305.03045v1 | link |
2023-05-04 | Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization | Connor Z. Lin et.al. | 2305.03043v1 | null |
2023-05-04 | Are VAEs Bad at Reconstructing Molecular Graphs? | Hagen Muenkler et.al. | 2305.03041v1 | null |
2023-05-04 | TUVF: Learning Generalizable Texture UV Radiance Fields | An-Chieh Cheng et.al. | 2305.03040v1 | null |
2023-05-03 | Characterizing Political Bias in Automatic Summaries: A Case Study of Trump and Biden | Karen Zhou et.al. | 2305.02321v1 | link |
2023-05-03 | Generating Synthetic Documents for Cross-Encoder Re-Rankers: A Comparative Study of ChatGPT and Human Experts | Arian Askari et.al. | 2305.02320v1 | link |
2023-05-03 | Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings | Daniel Rose et.al. | 2305.02317v1 | null |
2023-05-03 | AG3D: Learning to Generate 3D Avatars from 2D Image Collections | Zijian Dong et.al. | 2305.02312v1 | null |
2023-05-03 | Real-Time Radiance Fields for Single-Image Portrait View Synthesis | Alex Trevithick et.al. | 2305.02310v1 | null |
2023-05-03 | Calibrated Explanations: with Uncertainty Information and Counterfactuals | Helena Lofstrom et.al. | 2305.02305v1 | link |
2023-05-02 | Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection | Ruoshi Liu et.al. | 2305.01652v1 | null |
2023-05-02 | Generalizing Dataset Distillation via Deep Generative Prior | George Cazenavette et.al. | 2305.01649v1 | link |
2023-05-02 | Sequence Modeling with Multiresolution Convolutional Memory | Jiaxin Shi et.al. | 2305.01638v1 | link |
2023-05-02 | The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers | Ariel Gera et.al. | 2305.01628v1 | link |
2023-05-02 | Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks | Gašper Beguš et.al. | 2305.01626v1 | null |
2023-05-02 | TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis | Mathis Petrovich et.al. | 2305.00976v1 | null |
2023-05-01 | ArK: Augmented Reality with Knowledge Interactive Emergent Ability | Qiuyuan Huang et.al. | 2305.00970v1 | null |
2023-05-01 | PMDG: Privacy for Multi-Perspective Process Mining through Data Generalization | Ryan Hildebrant et.al. | 2305.00960v1 | null |
2023-05-01 | Non-Binary LDPC Code Design for Energy-Time Entanglement Quantum Key Distribution | Debarnab Mitra et.al. | 2305.00956v1 | null |
2023-05-01 | Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation | Patrick Fernandes et.al. | 2305.00955v1 | null |
2023-04-28 | LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Peng Gao et.al. | 2304.15010v1 | link |
2023-04-28 | Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs | George Pu et.al. | 2304.14999v1 | null |
2023-04-28 | ChatGPT – a Blessing or a Curse for Undergraduate Computer Science Students and Instructors? | Ishika Joshi et.al. | 2304.14993v1 | null |
2023-04-28 | Robust Stackelberg Equilibria | Jiarui Gan et.al. | 2304.14990v1 | null |
2023-04-28 | Interpreting Vision and Language Generative Models with Semantic Visual Priors | Michele Cafagna et.al. | 2304.14986v1 | null |
2023-04-28 | Optimal majority rules and quantitative Condorcet properties of setwise Kemeny voting schemes | Xuan Kien Phung et.al. | 2304.14980v1 | null |
2023-04-28 | MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks | Lei Zhang et.al. | 2304.14979v1 | link |
2023-04-27 | ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System | Junke Wang et.al. | 2304.14407v1 | null |
2023-04-27 | Motion-Conditioned Diffusion Model for Controllable Video Synthesis | Tsai-Shien Chen et.al. | 2304.14404v1 | null |
2023-04-27 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | Minghao Wu et.al. | 2304.14402v1 | link |
2023-04-27 | ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs | Jiteng Mu et.al. | 2304.14401v1 | null |
2023-04-27 | IconShop: Text-Based Vector Icon Synthesis with Autoregressive Transformers | Ronghuan Wu et.al. | 2304.14400v1 | null |
2023-04-27 | We’re Afraid Language Models Aren’t Modeling Ambiguity | Alisa Liu et.al. | 2304.14399v1 | link |
2023-04-27 | Maximizing Model Generalization for Manufacturing with Self-Supervised Learning and Federated Learning | Matthew Russell et.al. | 2304.14398v1 | null |
2023-04-27 | Learning Articulated Shape with Keypoint Pseudo-labels from Web Images | Anastasis Stathopoulos et.al. | 2304.14396v1 | null |
2023-04-27 | SeqTrack: Sequence to Sequence Learning for Visual Object Tracking | Xin Chen et.al. | 2304.14394v1 | link |
2023-04-26 | Controllable Image Generation via Collage Representations | Arantxa Casanova et.al. | 2304.13722v1 | null |
2023-04-26 | Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery | Debadutta Dash et.al. | 2304.13714v1 | null |
2023-04-27 | Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond | Jingfeng Yang et.al. | 2304.13712v2 | link |
2023-04-26 | UniNeXt: Exploring A Unified Architecture for Vision Recognition | Fangjian Lin et.al. | 2304.13700v1 | link |
2023-04-26 | Hitting Subgraphs in Sparse Graphs and Geometric Intersection Graphs | Daniel Lokshtanov et.al. | 2304.13695v1 | null |
2023-04-26 | HeySQuAD: A Spoken Question Answering Dataset | Yijing Wu et.al. | 2304.13689v1 | link |
2023-04-25 | DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection | Huan-ang Gao et.al. | 2304.13031v1 | link |
2023-04-25 | On the mechanism of polaritonic rate suppression from quantum transition paths | Michelle C. Anderson et.al. | 2304.13024v1 | null |
2023-04-25 | Seeing is not always believing: A Quantitative Study on Human Perception of AI-Generated Images | Zeyu Lu et.al. | 2304.13023v1 | link |
2023-04-25 | Certifying Ensembles: A General Certification Theory with S-Lipschitzness | Aleksandar Petrov et.al. | 2304.13019v1 | null |
2023-04-25 | Bibliometric Data Fusion for Biomedical Information Retrieval | Timo Breuer et.al. | 2304.13012v1 | null |
2023-04-25 | The Potential of Visual ChatGPT For Remote Sensing | Lucas Prado Osco et.al. | 2304.13009v1 | null |
2023-04-25 | Answering Questions by Meta-Reasoning over Multiple Chains of Thought | Ori Yoran et.al. | 2304.13007v1 | link |
2023-04-24 | Explicit Correspondence Matching for Generalizable Neural Radiance Fields | Yuedong Chen et.al. | 2304.12294v1 | link |
2023-04-24 | Synthpop++: A Hybrid Framework for Generating A Country-scale Synthetic Population | Bhavesh Neekhra et.al. | 2304.12284v1 | link |
2023-04-21 | Deep-Learning-based Fast and Accurate 3D CT Deformable Image Registration in Lung Cancer | Yuzhen Ding et.al. | 2304.11135v1 | null |
2023-04-20 | Learning Sparse and Low-Rank Priors for Image Recovery via Iterative Reweighted Least Squares Minimization | Stamatios Lefkimmiatis et.al. | 2304.10536v1 | null |
2023-04-20 | Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion | Tomas Jakab et.al. | 2304.10535v1 | null |
2023-04-20 | Collaborative Diffusion for Multi-Modal Face Generation and Editing | Ziqi Huang et.al. | 2304.10530v1 | link |
2023-04-20 | Generalizing Neural Human Fitting to Unseen Poses With Articulated SE(3) Equivariance | Haiwen Feng et.al. | 2304.10528v1 | null |
2023-04-20 | Multidimensional Uncertainty Quantification for Deep Neural Networks | Xujiang Zhao et.al. | 2304.10527v1 | null |
2023-04-20 | GenCorres: Consistent Shape Matching via Coupled Implicit-Explicit Shape Generative Models | Haitao Yang et.al. | 2304.10523v1 | link |
2023-04-20 | Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget | Johannes Lehner et.al. | 2304.10520v1 | link |
2023-04-19 | LipsFormer: Introducing Lipschitz Continuity to Vision Transformers | Xianbiao Qi et.al. | 2304.09856v1 | link |
2023-04-19 | Bridging RL Theory and Practice with the Effective Horizon | Cassidy Laidlaw et.al. | 2304.09853v1 | link |
2023-04-19 | Evaluating Verifiability in Generative Search Engines | Nelson F. Liu et.al. | 2304.09848v1 | link |
2023-04-19 | Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | Pan Lu et.al. | 2304.09842v1 | link |
2023-04-19 | Points of non-linearity of functions generated by random neural networks | David Holmes et.al. | 2304.09837v1 | null |
2023-04-18 | Optimal PAC Bounds Without Uniform Convergence | Ishaq Aden-Ali et.al. | 2304.09167v1 | null |
2023-04-18 | Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task | Zihao Wu et.al. | 2304.09138v1 | null |
2023-04-17 | Conditional Generation of Audio from Video via Foley Analogies | Yuexi Du et.al. | 2304.08490v1 | link |
2023-04-17 | Hyper-Decision Transformer for Efficient Online Policy Adaptation | Mengdi Xu et.al. | 2304.08487v1 | null |
2023-04-17 | Visual Instruction Tuning | Haotian Liu et.al. | 2304.08485v1 | link |
2023-04-17 | Text2Performer: Text-Driven Human Video Generation | Yuming Jiang et.al. | 2304.08483v1 | link |
2023-04-17 | Towards Robust Prompts on Vision-Language Models | Jindong Gu et.al. | 2304.08479v1 | null |
2023-04-18 | Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation | Jie An et.al. | 2304.08477v2 | null |
2023-04-14 | Cross-Entropy Loss Functions: Theoretical Analysis and Applications | Anqi Mao et.al. | 2304.07288v1 | null |
2023-04-14 | Solving Unique Games over Globally Hypercontractive Graphs | Mitali Bafna et.al. | 2304.07284v1 | null |
2023-04-14 | Synthetically Generating Human-like Data for Sequential Decision Making Tasks via Reward-Shaped Imitation Learning | Bryan Brandt et.al. | 2304.07280v1 | null |
2023-04-17 | Identifying Cluttering Edges in Near-Planar Graphs | Simon van Wageningen et.al. | 2304.07274v2 | link |
2023-04-13 | Expressive Text-to-Image Generation with Rich Text | Songwei Ge et.al. | 2304.06720v1 | null |
2023-04-13 | Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction | Hansheng Chen et.al. | 2304.06714v1 | link |
2023-04-13 | What does CLIP know about a red circle? Visual prompt engineering for VLMs | Aleksandar Shtedritski et.al. | 2304.06712v1 | null |
2023-04-13 | DiffusionRig: Learning Personalized Priors for Facial Appearance Editing | Zheng Ding et.al. | 2304.06711v1 | link |
2023-04-13 | How Will It Drape Like? Capturing Fabric Mechanics from Depth Images | Carlos Rodriguez-Pardo et.al. | 2304.06704v1 | null |
2023-04-13 | Learning Controllable 3D Diffusion Models from Single-view Images | Jiatao Gu et.al. | 2304.06700v1 | null |
2023-04-13 | Improving novelty detection with generative adversarial networks on hand gesture data | Miguel Simão et.al. | 2304.06696v1 | null |
2023-04-12 | Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA | James Seale Smith et.al. | 2304.06027v1 | null |
2023-04-12 | DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion | Johanna Karras et.al. | 2304.06025v1 | null |
2023-04-12 | Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views | Siwei Zhang et.al. | 2304.06024v1 | link |
2023-04-12 | SAM Struggles in Concealed Scenes – Empirical Study on “Segment Anything” | Ge-Peng Ji et.al. | 2304.06022v1 | null |
2023-04-12 | Crowd Counting with Sparse Annotation | Shiwei Zhang et.al. | 2304.06021v1 | null |
2023-04-12 | VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs | Moayed Haji Ali et.al. | 2304.06020v1 | null |
2023-04-12 | Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera | Ruicheng Feng et.al. | 2304.06019v1 | link |
2023-04-12 | Bi-level Latent Variable Model for Sample-Efficient Multi-Agent Reinforcement Learning | Aravind Venugopal et.al. | 2304.06011v1 | null |
2023-04-11 | HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models | Eslam Mohamed Bakr et.al. | 2304.05390v1 | link |
2023-04-11 | Human-AI Co-Creation Approach to Find Forever Chemicals Replacements | Juliana Jansen Ferreira et.al. | 2304.05389v1 | null |
2023-04-11 | MOST: Multiple Object localization with Self-supervised Transformers for object discovery | Sai Saketh Rambhatla et.al. | 2304.05387v1 | null |
2023-04-11 | Bloom filters for molecules | Jorge Medina et.al. | 2304.05386v1 | link |
2023-04-10 | A Cheaper and Better Diffusion Language Model with Soft-Masked Noise | Jiaao Chen et.al. | 2304.04746v1 | link |
2023-04-10 | Ambiguous Medical Image Segmentation using Diffusion Models | Aimon Rahman et.al. | 2304.04745v1 | link |
2023-04-10 | On the Possibilities of AI-Generated Text Detection | Souradip Chakraborty et.al. | 2304.04736v1 | null |
2023-04-07 | Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following | Mingyu Ding et.al. | 2304.03767v1 | null |
2023-04-07 | Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering | Hung-Ting Su et.al. | 2304.03754v1 | null |
2023-04-07 | V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang et.al. | 2304.03752v1 | null |
2023-04-07 | Perspectives on AI Architectures and Co-design for Earth System Predictability | Maruti K. Mudunuru et.al. | 2304.03748v1 | null |
2023-04-07 | Assessing Perceived Fairness from Machine Learning Developer’s Perspective | Anoop Mishra et.al. | 2304.03745v1 | null |
2023-04-06 | Diffusion Models as Masked Autoencoders | Chen Wei et.al. | 2304.03283v1 | null |
2023-04-06 | Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark | Alexander Pan et.al. | 2304.03279v1 | link |
2023-04-06 | How Do US Congress Members Advertise Climate Change: An Analysis Of Ads Run On Meta’s Platforms | Laurenz Aisenpreis et.al. | 2304.03278v1 | null |
2023-04-06 | Instruction Tuning with GPT-4 | Baolin Peng et.al. | 2304.03277v1 | link |
2023-04-06 | That’s What I Said: Fully-Controllable Talking Face Generation | Youngjoon Jang et.al. | 2304.03275v1 | null |
2023-04-06 | Towards self-driving laboratories in chemistry and materials sciences: The central role of DFT in the era of AI | Bing Huang et.al. | 2304.03272v1 | null |
2023-04-06 | Causal Discovery with Score Matching on Additive Models with Arbitrary Noise | Francesco Montagna et.al. | 2304.03265v1 | null |
2023-04-05 | Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models | Xuhui Jia et.al. | 2304.02642v1 | null |
2023-04-05 | ENTL: Embodied Navigation Trajectory Learner | Klemen Kotar et.al. | 2304.02639v1 | null |
2023-04-05 | GenPhys: From Physical Processes to Generative Models | Ziming Liu et.al. | 2304.02637v1 | null |
2023-04-05 | HNeRV: A Hybrid Neural Representation for Videos | Hao Chen et.al. | 2304.02633v1 | link |
2023-04-05 | Towards Explainable AI Writing Assistants for Non-native English Speakers | Yewon Kim et.al. | 2304.02625v1 | null |
2023-04-05 | High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation | Arvi Jonnarth et.al. | 2304.02621v1 | link |
2023-04-04 | Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT | Yinlin Deng et.al. | 2304.02014v1 | null |
2023-04-04 | NPC: Neural Point Characters from Video | Shih-Yang Su et.al. | 2304.02013v1 | null |
2023-04-04 | EGC: Image Generation and Classification via a Single Energy-Based Model | Qiushan Guo et.al. | 2304.02012v1 | link |
2023-04-04 | FakET: Simulating Cryo-Electron Tomograms with Neural Style Transfer | Pavol Harar et.al. | 2304.02011v1 | link |
2023-04-04 | OrienterNet: Visual Localization in 2D Public Maps with Neural Matching | Paul-Edouard Sarlin et.al. | 2304.02009v1 | null |
2023-04-04 | MonoHuman: Animatable Human Neural Field from Monocular Video | Zhengming Yu et.al. | 2304.02001v1 | null |
2023-04-04 | Revisiting the Evaluation of Image Synthesis with GANs | Mengping Yang et.al. | 2304.01999v1 | link |
2023-04-03 | Video Instance Segmentation in an Open-World | Omkar Thawakar et.al. | 2304.01200v1 | link |
2023-04-03 | Zero-Shot Semantic Segmentation with Decoupled One-Pass Network | Cong Han et.al. | 2304.01198v1 | link |
2023-04-03 | Bringing Telepresence to Every Desk | Shengze Wang et.al. | 2304.01197v1 | null |
2023-04-04 | Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | Canwen Xu et.al. | 2304.01196v2 | link |
2023-04-03 | Burstormer: Burst Image Restoration and Enhancement Transformer | Akshay Dudhane et.al. | 2304.01194v1 | link |
2023-04-03 | Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos | Yue Ma et.al. | 2304.01186v1 | link |
2023-04-03 | Whistler Wave Observations by Parker Solar Probe During Encounter 1: Counter-Propagating Whistlers Collocated with Magnetic Field Inhomogeneities and their Application to Electric Field Measurement Calibration | S. Karbashewski et.al. | 2304.01185v1 | null |
2023-03-31 | Towards Flexible Multi-modal Document Models | Naoto Inoue et.al. | 2303.18248v1 | link |
2023-03-31 | Speeding up Madgraph5 aMC@NLO through CPU vectorization and GPU offloading: towards a first alpha release | Andrea Valassi et.al. | 2303.18244v1 | null |
2023-03-31 | $\infty$-Diff: Infinite Resolution Diffusion with Subsampled Mollified States | Sam Bond-Taylor et.al. | 2303.18242v1 | link |
2023-03-31 | Procedure-Aware Pretraining for Instructional Video Understanding | Honglu Zhou et.al. | 2303.18230v1 | link |
2023-03-31 | A Survey of Large Language Models | Wayne Xin Zhao et.al. | 2303.18223v1 | link |
2023-03-31 | SemHint-MD: Learning from Noisy Semantic Labels for Self-Supervised Monocular Depth Estimation | Shan Lin et.al. | 2303.18219v1 | null |
2023-03-31 | A Closer Look at Few-Shot 3D Point Cloud Classification | Chuangguan Ye et.al. | 2303.18210v1 | link |
2023-03-30 | AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control | Ruixiang Jiang et.al. | 2303.17606v1 | link |
2023-03-30 | Token Merging for Fast Stable Diffusion | Daniel Bolya et.al. | 2303.17604v1 | link |
2023-03-30 | NeRF-Supervised Deep Stereo | Fabio Tosi et.al. | 2303.17603v1 | link |
2023-03-30 | Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Weihua Chen et.al. | 2303.17602v1 | link |
2023-03-30 | When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning | Zichen Zhang et.al. | 2303.17600v1 | null |
2023-03-30 | Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models | Wen Wang et.al. | 2303.17599v1 | link |
2023-03-30 | Consistent View Synthesis with Pose-Guided Diffusion Models | Hung-Yu Tseng et.al. | 2303.17598v1 | null |
2023-03-30 | MobileInst: Video Instance Segmentation on the Mobile | Renhong Zhang et.al. | 2303.17594v1 | null |
2023-03-29 | AutoAD: Movie Description in Context | Tengda Han et.al. | 2303.16899v1 | link |
2023-03-29 | Bagging by Learning to Singulate Layers Using Interactive Perception | Lawrence Yunliang Chen et.al. | 2303.16898v1 | null |
2023-03-29 | Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos | Kun Su et.al. | 2303.16897v1 | null |
2023-03-29 | Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation | Md Mostafijur Rahman et.al. | 2303.16892v1 | link |
2023-03-29 | Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations | Vibashan VS et.al. | 2303.16891v1 | null |
2023-03-29 | DPF: Learning Dense Prediction Fields with Weak Supervision | Xiaoxue Chen et.al. | 2303.16890v1 | link |
2023-03-29 | Towards Understanding the Effect of Pretraining Label Granularity | Guan Zhe Hong et.al. | 2303.16887v1 | null |
2023-03-29 | End-to-End $n$-ary Relation Extraction for Combination Drug Therapies | Yuhang Jiang et.al. | 2303.16886v1 | link |
2023-03-29 | Instant Neural Radiance Fields Stylization | Shaoxu Li et.al. | 2303.16884v1 | link |
2023-03-29 | Your Diffusion Model is Secretly a Zero-Shot Classifier | Alexander C. Li et.al. | 2303.16203v2 | link |
2023-03-28 | LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention | Renrui Zhang et.al. | 2303.16199v1 | link |
2023-03-28 | BC-IRL: Learning Generalizable Reward Functions from Demonstrations | Andrew Szot et.al. | 2303.16194v1 | null |
2023-03-28 | Planning with Sequence Models through Iterative Energy Minimization | Hongyi Chen et.al. | 2303.16189v1 | null |
2023-03-28 | Visual Chain-of-Thought Diffusion Models | William Harvey et.al. | 2303.16187v1 | link |
2023-03-28 | Label Smoothing Improves Neural Source Code Summarization | Sakib Haque et.al. | 2303.16178v1 | null |
2023-03-27 | IRFL: Image Recognition of Figurative Language | Ron Yosef et.al. | 2303.15445v1 | link |
2023-03-27 | Zero-shot Model Diagnosis | Jinqi Luo et.al. | 2303.15441v1 | null |
2023-03-27 | FaceLit: Neural 3D Relightable Faces | Anurag Ranjan et.al. | 2303.15437v1 | null |
2023-03-27 | The Stable Signature: Rooting Watermarks in Latent Diffusion Models | Pierre Fernandez et.al. | 2303.15435v1 | link |
2023-03-27 | Anti-DreamBooth: Protecting users from personalized text-to-image synthesis | Thanh Van Le et.al. | 2303.15433v1 | link |
2023-03-27 | TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models | Md Kamrul Hasan et.al. | 2303.15430v1 | null |
2023-03-27 | JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields | Xi Wang et.al. | 2303.15427v1 | link |
2023-03-24 | Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning | Xiaoyang Wu et.al. | 2303.14191v1 | link |
2023-03-24 | Learning from Few Demonstrations with Frame-Weighted Motion Generation | Jianyong Sun et.al. | 2303.14188v1 | null |
2023-03-24 | Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior | Junshu Tang et.al. | 2303.14184v1 | link |
2023-03-24 | Scaling Expert Language Models with Unsupervised Domain Discovery | Suchin Gururangan et.al. | 2303.14177v1 | link |
2023-03-24 | A Hybrid ANN-SNN Architecture for Low-Power and Low-Latency Visual Perception | Asude Aydin et.al. | 2303.14176v1 | null |
2023-03-24 | UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields | Yuanbo Yang et.al. | 2303.14167v1 | null |
2023-03-23 | Ablating Concepts in Text-to-Image Diffusion Models | Nupur Kumari et.al. | 2303.13516v1 | link |
2023-03-23 | Persistent Nature: A Generative Model of Unbounded 3D Worlds | Lucy Chai et.al. | 2303.13515v1 | link |
2023-03-23 | DreamBooth3D: Subject-Driven Text-to-3D Generation | Amit Raj et.al. | 2303.13508v1 | null |
2023-03-23 | A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition | Andong Deng et.al. | 2303.13505v1 | link |
2023-03-23 | Chordal Averaging on Flag Manifolds and Its Applications | Nathan Mankovich et.al. | 2303.13501v1 | link |
2023-03-23 | A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias | Puja Trivedi et.al. | 2303.13500v1 | null |
2023-03-23 | TriPlaneNet: An Encoder for EG3D Inversion | Ananta R. Bhattarai et.al. | 2303.13497v1 | null |
2023-03-22 | Diffuse-Denoise-Count: Accurate Crowd-Counting with Diffusion Models | Yasiru Ranasinghe et.al. | 2303.12790v1 | link |
2023-03-22 | EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Hansheng Chen et.al. | 2303.12787v1 | link |
2023-03-22 | Localization-based OFDM framework for RIS-aided systems | Fabio Saggese et.al. | 2303.12763v1 | link |
2023-03-22 | MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset | Chen Feng et.al. | 2303.12756v1 | link |
2023-03-22 | Invariants for time-dependent Hamiltonian systems | Jürgen Struckmeier et.al. | 2303.12746v1 | null |
2023-03-22 | Comment on the elastica section in Thorne and Blandford “Modern Classical Physics”, the shape of things, and the aspect ratio of reality | J. A. Hanna et.al. | 2303.12729v1 | null |
2023-03-21 | Natural Language-Assisted Sign Language Recognition | Ronglai Zuo et.al. | 2303.12080v1 | link |
2023-03-21 | Two-shot Video Object Segmentation | Kun Yan et.al. | 2303.12078v1 | link |
2023-03-21 | CC3D: Layout-Conditioned Generation of Compositional 3D Scenes | Sherwin Bahmani et.al. | 2303.12074v1 | null |
2023-03-21 | ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals | Xishun Wang et.al. | 2303.12071v1 | null |
2023-03-21 | Machine Learning for Brain Disorders: Transformers and Visual Transformers | Robin Courant et.al. | 2303.12068v1 | null |
2023-03-20 | EVA-02: A Visual Representation for Neon Genesis | Yuxin Fang et.al. | 2303.11331v1 | link |
2023-03-20 | Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation | Ziyang Chen et.al. | 2303.11329v1 | link |
2023-03-20 | Zero-1-to-3: Zero-shot One Image to 3D Object | Ruoshi Liu et.al. | 2303.11328v1 | link |
2023-03-20 | Open-vocabulary Panoptic Segmentation with Embedding Modulation | Xi Chen et.al. | 2303.11324v1 | null |
2023-03-20 | ScribbleSeg: Scribble-based Interactive Image Segmentation | Xi Chen et.al. | 2303.11320v1 | null |
2023-03-20 | Generative Semantic Segmentation | Jiaqi Chen et.al. | 2303.11316v1 | link |
2023-03-20 | waywiser: Ergonomic Methods for Assessing Spatial Models | Michael J Mahoney et.al. | 2303.11312v1 | link |
2023-03-17 | Data-centric Artificial Intelligence: A Survey | Daochen Zha et.al. | 2303.10158v1 | link |
2023-03-17 | CoVIO: Online Continual Learning for Visual-Inertial Odometry | Niclas Vödisch et.al. | 2303.10149v1 | link |
2023-03-17 | CoDEPS: Online Continual Learning for Depth Estimation and Panoptic Segmentation | Niclas Vödisch et.al. | 2303.10147v1 | link |
2023-03-17 | Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting | Nicolai Dorka et.al. | 2303.10144v1 | link |
2023-03-16 | Efficient Diffusion Training via Min-SNR Weighting Strategy | Tiankai Hang et.al. | 2303.09556v1 | link |
2023-03-16 | PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision | Konstantinos Tertikas et.al. | 2303.09554v1 | null |
2023-03-16 | SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving | Yi Wei et.al. | 2303.09551v1 | link |
2023-03-16 | Diffusion-HPC: Generating Synthetic Images with Realistic Humans | Zhenzhen Weng et.al. | 2303.09541v1 | link |
2023-03-16 | Deep Metric Learning for Unsupervised Remote Sensing Change Detection | Wele Gedara Chaminda Bandara et.al. | 2303.09536v1 | link |
2023-03-17 | FateZero: Fusing Attentions for Zero-shot Text-based Video Editing | Chenyang Qi et.al. | 2303.09535v2 | link |
2023-03-16 | Tackling Clutter in Radar Data – Label Generation and Detection Using PointNet++ | Johannes Kopp et.al. | 2303.09530v1 | link |
2023-03-15 | Borda Regret Minimization for Generalized Linear Dueling Bandits | Yue Wu et.al. | 2303.08816v1 | null |
2023-03-15 | BiFormer: Vision Transformer with Bi-Level Routing Attention | Lei Zhu et.al. | 2303.08810v1 | link |
2023-03-15 | Stochastic Interpolants: A Unifying Framework for Flows and Diffusions | Michael S. Albergo et.al. | 2303.08797v1 | null |
2023-03-15 | PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining | Garrett Thomas et.al. | 2303.08789v1 | null |
2023-03-14 | Diversity-Aware Meta Visual Prompting | Qidong Huang et.al. | 2303.08138v1 | link |
2023-03-14 | LayoutDM: Discrete Diffusion Model for Controllable Layout Generation | Naoto Inoue et.al. | 2303.08137v1 | link |
2023-03-15 | Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations | Jianren Wang et.al. | 2303.08135v2 | null |
2023-03-14 | MeshDiffusion: Score-based Generative 3D Mesh Modeling | Zhen Liu et.al. | 2303.08133v1 | link |
2023-03-15 | A Simple Framework for Open-Vocabulary Segmentation and Detection | Hao Zhang et.al. | 2303.08131v2 | link |
2023-03-14 | ViperGPT: Visual Inference via Python Execution for Reasoning | Dídac Surís et.al. | 2303.08128v1 | link |
2023-03-14 | Blind Video Deflickering by Neural Filtering with a Flawed Atlas | Chenyang Lei et.al. | 2303.08120v1 | link |
2023-03-14 | Parameterised Approximation of the Fixation Probability of the Dominant Mutation in the Multi-Type Moran Process | Leslie Ann Goldberg et.al. | 2303.08118v1 | null |
2023-03-13 | Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need | Da-Wei Zhou et.al. | 2303.07338v1 | link |
2023-03-13 | Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR | Feng Li et.al. | 2303.07335v1 | link |
2023-03-13 | A Smoothing Algorithm for Minimum Sensing Path Plans in Gaussian Belief Space | Ali Reza Pedram et.al. | 2303.07326v1 | null |
2023-03-13 | Collision Cross-entropy and EM Algorithm for Self-labeled Classification | Zhongwen Zhang et.al. | 2303.07321v1 | null |
2023-03-13 | Linear regularized 13-moment equations with Onsager boundary conditions for general gas molecules | Zhenning Cai et.al. | 2303.07314v1 | null |
2023-03-13 | An efficient phase-field model of shear fractures using deviatoric stress split | Ehsan Haghighat et.al. | 2303.07309v1 | link |
2023-03-10 | Multiple Hands Make Light Work: Enhancing Quality and Diversity using MAP-Elites with Multiple Parallel Evolution Strategies | Manon Flageat et.al. | 2303.06137v1 | null |
2023-03-10 | Rewarding Chatbots for Real-World Engagement with Millions of Users | Robert Irvine et.al. | 2303.06135v1 | null |
2023-03-10 | Imaging the crustal and upper mantle structure of the North Anatolian Fault: A Transmission Matrix Framework for Local Adaptive Focusing | Rita Touma et.al. | 2303.06123v1 | null |
2023-03-10 | Ignorance is Bliss: Robust Control via Information Gating | Manan Tomar et.al. | 2303.06121v1 | null |
2023-03-11 | Wave-function parametrization of a probability measure | Leonardo Pedro et.al. | 2303.06069v1 | null |
2023-03-09 | Scaling up GANs for Text-to-Image Synthesis | Minguk Kang et.al. | 2303.05511v1 | null |
2023-03-09 | Planning with Large Language Models for Code Generation | Shun Zhang et.al. | 2303.05510v1 | null |
2023-03-09 | Cherry-Picking with Reinforcement Learning | Yunchu Zhang et.al. | 2303.05508v1 | null |
2023-03-09 | TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization | Alan Jeffares et.al. | 2303.05506v1 | link |
2023-03-09 | Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision | Tarun Kalluri et.al. | 2303.05503v1 | null |
2023-03-09 | PDSketch: Integrated Planning Domain Programming and Learning | Jiayuan Mao et.al. | 2303.05501v1 | null |
2023-03-10 | Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu et.al. | 2303.05499v2 | link |
2023-03-09 | Learning Stationary Markov Processes with Contrastive Adjustment | Ludvig Bergenstråhle et.al. | 2303.05497v1 | link |
2023-03-09 | Sparse and Local Networks for Hypergraph Reasoning | Guangxuan Xiao et.al. | 2303.05496v1 | null |
2023-03-08 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | Jiarui Xu et.al. | 2303.04803v1 | link |
2023-03-08 | Stabilized profunctors and stable species of structures | Marcelo Fiore et.al. | 2303.04795v1 | null |
2023-03-08 | Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation | Paul Hagemann et.al. | 2303.04772v1 | link |
2023-03-08 | SMaLL: A Software Framework for portable Machine Learning Libraries | Upasana Sridhar et.al. | 2303.04769v1 | null |
2023-03-07 | Benign Overfitting for Two-layer ReLU Networks | Yiwen Kou et.al. | 2303.04145v1 | link |
2023-03-07 | Toward Defining a Domain Complexity Measure Across Domains | Katarina Doctor et.al. | 2303.04141v1 | null |
2023-03-07 | Diffusion Policy: Visuomotor Policy Learning via Action Diffusion | Cheng Chi et.al. | 2303.04137v1 | null |
2023-03-07 | Inadequacy of equivalent circuits in nonlinear systems with inherent memory | V. Lopez-Richard et.al. | 2303.04135v1 | null |
2023-03-07 | Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction | Martin Josifoski et.al. | 2303.04132v1 | link |
2023-03-07 | Foundation Models for Decision Making: Problems, Methods, and Opportunities | Sherry Yang et.al. | 2303.04129v1 | null |
2023-03-07 | Private Read-Update-Write with Controllable Information Leakage for Storage-Efficient Federated Learning with Top $r$ Sparsification | Sajani Vithana et.al. | 2303.04123v1 | null |
2023-03-06 | Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers | Sitan Chen et.al. | 2303.03384v1 | null |
2023-03-06 | SUREL+: Moving from Walks to Sets for Scalable Subgraph-based Graph Representation Learning | Haoteng Yin et.al. | 2303.03379v1 | link |
2023-03-06 | PaLM-E: An Embodied Multimodal Language Model | Danny Driess et.al. | 2303.03378v1 | null |
2023-03-06 | MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning | Mikayel Samvelyan et.al. | 2303.03376v1 | null |
2023-03-06 | Detecting Human-Object Contact in Images | Yixin Chen et.al. | 2303.03373v1 | link |
2023-03-06 | ALMOST: Adversarial Learning to Mitigate Oracle-less ML Attacks via Synthesis Tuning | Animesh Basak Chowdhury et.al. | 2303.03372v1 | null |
2023-03-06 | Complex Systems of Secrecy: The Offshore Networks of Oligarchs | Ho-Chun Herbert Chang et.al. | 2303.03371v1 | null |
2023-03-06 | Multimodal Prompting with Missing Modalities for Visual Recognition | Yi-Lun Lee et.al. | 2303.03369v1 | link |
2023-03-06 | Referring Multi-Object Tracking | Dongming Wu et.al. | 2303.03366v1 | link |
2023-03-06 | Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed Environments | Jun Yamada et.al. | 2303.03365v1 | null |
2023-03-03 | Unleashing Text-to-Image Diffusion Models for Visual Perception | Wenliang Zhao et.al. | 2303.02153v1 | link |
2023-03-03 | Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners | Renrui Zhang et.al. | 2303.02151v1 | link |
2023-03-03 | Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together! | Shiwei Liu et.al. | 2303.02141v1 | link |
2023-03-03 | Eventual Discounting Temporal Logic Counterfactual Experience Replay | Cameron Voloshin et.al. | 2303.02135v1 | null |
2023-03-02 | Dropout Reduces Underfitting | Zhuang Liu et.al. | 2303.01500v1 | link |
2023-03-02 | Predicting Motion Plans for Articulating Everyday Objects | Arjun Gupta et.al. | 2303.01484v1 | null |
2023-03-02 | Faster exact and approximation algorithms for packing and covering matroids via push-relabel | Kent Quanrud et.al. | 2303.01478v1 | null |
2023-03-01 | StraIT: Non-autoregressive Generation with Stratified Image Transformer | Shengju Qian et.al. | 2303.00750v1 | null |
2023-03-01 | Coordination of Multiple Robots along Given Paths with Bounded Junction Complexity | Mikkel Abrahamsen et.al. | 2303.00745v1 | null |
2023-03-01 | READ Avatars: Realistic Emotion-controllable Audio Driven Avatars | Jack Saunders et.al. | 2303.00744v1 | null |
2023-03-01 | R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents | Daniel D. Johnson et.al. | 2303.00732v1 | link |
2023-03-01 | A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation | J. Pourmostafa Roshan Sharami et.al. | 2303.00722v1 | null |
2023-02-28 | An Efficient Tester-Learner for Halfspaces | Aravind Gollakota et.al. | 2302.14853v1 | null |
2023-02-27 | Internet Explorer: Targeted Representation Learning on the Open Web | Alexander C. Li et.al. | 2302.14051v1 | link |
2023-02-27 | Language Is Not All You Need: Aligning Perception with Language Models | Shaohan Huang et.al. | 2302.14045v1 | link |
2023-02-27 | Permutation Equivariant Neural Functionals | Allan Zhou et.al. | 2302.14040v1 | link |
2023-02-27 | Measurement of Orbital Angular Momentum of Light using Stokes Parameters and Barnett’s Formalism | Anirban Debnath et.al. | 2302.14025v1 | null |
2023-02-27 | Diacritic Recognition Performance in Arabic ASR | Hanan Aldarmaki et.al. | 2302.14022v1 | null |
2023-02-27 | Full Stack Optimization of Transformer Inference: a Survey | Sehoon Kim et.al. | 2302.14017v1 | null |
2023-02-24 | SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries | Ahmed Imtiaz Humayun et.al. | 2302.12828v1 | link |
2023-02-24 | Generative Models of Huge Objects | Lunjia Hu et.al. | 2302.12823v1 | null |
2023-02-24 | Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data | KaShun Shum et.al. | 2302.12822v1 | link |
2023-02-24 | GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification | Mengting Zhou et.al. | 2302.12814v1 | null |
2023-02-24 | Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback | Baolin Peng et.al. | 2302.12813v1 | null |
2023-02-23 | Change is Hard: A Closer Look at Subpopulation Shift | Yuzhe Yang et.al. | 2302.12254v1 | link |
2023-02-23 | Boosting Adversarial Transferability using Dynamic Cues | Muzammal Naseer et.al. | 2302.12252v1 | null |
2023-02-23 | VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion | Yiming Li et.al. | 2302.12251v1 | link |
2023-02-23 | Sequence-Based Incremental Concolic Testing of RTL Models | Hasini Witharana et.al. | 2302.12241v1 | null |
2023-02-23 | What makes a language easy to deep-learn? | Lukas Galke et.al. | 2302.12239v1 | link |
2023-02-23 | Improving Adaptive Conformal Prediction Using Self-Supervised Learning | Nabeel Seedat et.al. | 2302.12238v1 | link |
2023-02-23 | Learning Neural Volumetric Representations of Dynamic Humans in Minutes | Chen Geng et.al. | 2302.12237v1 | link |
2023-02-23 | DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models | Jamie Wynn et.al. | 2302.12231v1 | link |
2023-02-22 | Beyond optimal disturbances: a statistical framework for transient growth | Peter Frame et.al. | 2302.11564v1 | null |
2023-02-22 | Uncovering Bias in Face Generation Models | Cristian Muñoz et.al. | 2302.11562v1 | null |
2023-02-22 | Equivariant Polynomials for Graph Neural Networks | Omri Puny et.al. | 2302.11556v1 | null |
2023-02-22 | RoboNinja: Learning an Adaptive Cutting Policy for Multi-Material Objects | Zhenjia Xu et.al. | 2302.11553v1 | null |
2023-02-22 | Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC | Yilun Du et.al. | 2302.11552v1 | link |
2023-02-22 | Scaling Robot Learning with Semantically Imagined Experience | Tianhe Yu et.al. | 2302.11550v1 | null |
2023-02-21 | Some Fundamental Aspects about Lipschitz Continuity of Neural Network Functions | Grigory Khromov et.al. | 2302.10886v1 | null |
2023-02-21 | Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction | Pei Xu et.al. | 2302.10873v1 | link |
2023-02-21 | Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation | Biao Zhang et.al. | 2302.10871v1 | link |
2023-02-21 | Provable Copyright Protection for Generative Models | Nikhil Vyas et.al. | 2302.10870v1 | null |
2023-02-21 | A Unifying Perspective on Multi-Calibration: Unleashing Game Dynamics for Multi-Objective Learning | Nika Haghtalab et.al. | 2302.10863v1 | null |
2023-02-20 | Towards Universal Fake Image Detectors that Generalize Across Generative Models | Utkarsh Ojha et.al. | 2302.10174v1 | link |
2023-02-20 | Identity-Based Attribute Prototypes Distinguish Communities on Twitter | Thomas Magelinski et.al. | 2302.10172v1 | null |
2023-02-20 | Compressed Error HARQ: Feedback Communication on Noise-Asymmetric Channels | Sravan Kumar Ankireddy et.al. | 2302.10170v1 | link |
2023-02-20 | Learning Deep Semantics for Test Completion | Pengyu Nie et.al. | 2302.10166v1 | link |
2023-02-20 | Sparse PCA Beyond Covariance Thresholding | Gleb Novikov et.al. | 2302.10158v1 | null |
2023-02-17 | Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent | Giannis Daras et.al. | 2302.09057v1 | link |
2023-02-17 | Geometric description of clustering in directed networks | Antoine Allard et.al. | 2302.09055v1 | link |
2023-02-17 | MiDi: Mixed Graph and 3D Denoising Diffusion for Molecule Generation | Clement Vignac et.al. | 2302.09048v1 | link |
2023-02-17 | From User Perceptions to Technical Improvement: Enabling People Who Stutter to Better Use Speech Recognition | Colin Lea et.al. | 2302.09044v1 | null |
2023-02-17 | Privately Customizing Prefinetuning to Better Match User Data in Federated Learning | Charlie Hou et.al. | 2302.09042v1 | null |
2023-02-16 | Text-driven Visual Synthesis with Latent Diffusion Prior | Ting-Hsuan Liao et.al. | 2302.08510v1 | null |
2023-02-16 | 3D-aware Conditional Image Synthesis | Kangle Deng et.al. | 2302.08509v1 | link |
2023-02-16 | The Scope of Multicalibration: Characterizing Multicalibration via Property Elicitation | Georgy Noarov et.al. | 2302.08507v1 | null |
2023-02-15 | Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks | Atabey Ünlü et.al. | 2302.07868v1 | link |
2023-02-15 | Learning Performance-Improving Code Edits | Aman Madaan et.al. | 2302.07867v1 | link |
2023-02-15 | Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation | Joshua Vendrow et.al. | 2302.07865v1 | link |
2023-02-15 | Big Little Transformer Decoder | Sehoon Kim et.al. | 2302.07863v1 | link |
2023-02-15 | One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2 | Trevine Oorloff et.al. | 2302.07848v1 | null |
2023-02-15 | NL2CMD: An Updated Workflow for Natural Language to Bash Commands Translation | Quchen Fu et.al. | 2302.07845v1 | link |
2023-02-14 | Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions | Raghav Singhal et.al. | 2302.07261v1 | null |
2023-02-14 | ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models | Sheng Wang et.al. | 2302.07257v1 | link |
2023-02-14 | Energy Transformer | Benjamin Hoover et.al. | 2302.07253v1 | link |
2023-02-14 | Generation Probabilities Are Not Enough: Exploring the Effectiveness of Uncertainty Highlighting in AI-Powered Code Completions | Helena Vasconcelos et.al. | 2302.07248v1 | null |
2023-02-14 | A Deep Probabilistic Spatiotemporal Framework for Dynamic Graph Representation Learning with Application to Brain Disorder Identification | Junn Yong Loo et.al. | 2302.07243v1 | null |
2023-02-14 | Parker Solar Probe Observations of High Plasma Beta Solar Wind from Streamer Belt | Jia Huang et.al. | 2302.07230v1 | null |
2023-02-13 | 3D-aware Blending with Generative NeRFs | Hyunsu Kim et.al. | 2302.06608v1 | link |
2023-02-13 | Generative Adversarial Equilibrium Solvers | Denizalp Goktas et.al. | 2302.06607v1 | null |
2023-02-13 | Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation | Yuanhao Wang et.al. | 2302.06606v1 | null |
2023-02-13 | FilFL: Accelerating Federated Learning via Client Filtering | Fares Fourati et.al. | 2302.06599v1 | null |
2023-02-13 | The Impact of AI on Developer Productivity: Evidence from GitHub Copilot | Sida Peng et.al. | 2302.06590v1 | null |
2023-02-13 | Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction | Xinyu Zhang et.al. | 2302.06589v1 | null |
2023-02-13 | Raising the Cost of Malicious AI-Powered Image Editing | Hadi Salman et.al. | 2302.06588v1 | link |
2023-02-13 | AbLit: A Resource for Analyzing and Generating Abridged Versions of English Literature | Melissa Roemmele et.al. | 2302.06579v1 | link |
2023-02-10 | Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features | Annie S. Chen et.al. | 2302.05441v1 | null |
2023-02-09 | RelightableHands: Efficient Neural Relighting of Articulated Hand Models | Shun Iwase et.al. | 2302.04866v1 | null |
2023-02-09 | Polynomial Neural Fields for Subband Decomposition and Manipulation | Guandao Yang et.al. | 2302.04862v1 | link |
2023-02-09 | Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning | Zhuolin Yang et.al. | 2302.04858v1 | null |
2023-02-09 | One-shot Visual Imitation via Attributed Waypoints and Demonstration Augmentation | Matthew Chang et.al. | 2302.04856v1 | null |
2023-02-09 | SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks | Mahdi Nikdan et.al. | 2302.04852v1 | link |
2023-02-09 | Robot Synesthesia: A Sound and Emotion Guided AI Painter | Vihaan Misra et.al. | 2302.04850v1 | link |
2023-02-09 | Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms | Stevo Racković et.al. | 2302.04843v1 | null |
2023-02-09 | Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation | Anton Voronov et.al. | 2302.04841v1 | link |
2023-02-08 | PFGM++: Unlocking the Potential of Physics-Inspired Generative Models | Yilun Xu et.al. | 2302.04265v1 | link |
2023-02-08 | Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration | Chentian Jiang et.al. | 2302.04250v1 | null |
2023-02-08 | Federated Minimax Optimization with Client Heterogeneity | Pranay Sharma et.al. | 2302.04249v1 | null |
2023-02-08 | Shortcut Detection with Variational Autoencoders | Nicolas M. Müller et.al. | 2302.04246v1 | link |
2023-02-07 | Long Horizon Temperature Scaling | Andy Shih et.al. | 2302.03686v1 | link |
2023-02-07 | Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications | Johannes Kirschner et.al. | 2302.03683v1 | null |
2023-02-07 | Auditing Gender Presentation Differences in Text-to-Image Models | Yanzhe Zhang et.al. | 2302.03675v1 | link |
2023-02-07 | Proportionality in Approval-Based Participatory Budgeting | Markus Brill et.al. | 2302.03672v1 | null |
2023-02-07 | Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery | Yuxin Wen et.al. | 2302.03668v1 | link |
2023-02-07 | HumanMAC: Masked Motion Completion for Human Motion Prediction | Ling-Hao Chen et.al. | 2302.03665v1 | link |
2023-02-07 | SDYN-GANs: Adversarial Learning Methods for Multistep Generative Models for General Order Stochastic Dynamics | Panos Stinis et.al. | 2302.03663v1 | null |
2023-02-06 | Zero-shot Image-to-Image Translation | Gaurav Parmar et.al. | 2302.03027v1 | link |
2023-02-06 | AIM: Adapting Image Models for Efficient Video Action Recognition | Taojiannan Yang et.al. | 2302.03024v1 | null |
2023-02-06 | Geometry of contact: contact planning for multi-legged robots via spin models duality | Baxi Chong et.al. | 2302.03019v1 | null |
2023-02-06 | Structure and Content-Guided Video Synthesis with Diffusion Models | Patrick Esser et.al. | 2302.03011v1 | null |
2023-02-06 | A novel Doppler backscattering (DBS) system to simultaneously monitor radio frequency plasma fluctuations and low frequency turbulence | S. Chowdhury et.al. | 2302.03009v1 | null |
2023-02-03 | Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study | Muhammad Waseem et.al. | 2302.01894v1 | null |
2023-02-03 | Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and Early Exits | Simone Sarti et.al. | 2302.01888v1 | null |
2023-02-03 | Analyzing the impact of climate change on critical infrastructure from the scientific literature: A weakly supervised NLP approach | Tanwi Mallick et.al. | 2302.01887v1 | null |
2023-02-03 | LIDAR-based Stabilization, Navigation and Localization for UAVs Operating in Dark Indoor Environments | Matěj Petrlík et.al. | 2302.01883v1 | null |
2023-02-03 | IKEA-Manual: Seeing Shape Assembly Step by Step | Ruocheng Wang et.al. | 2302.01881v1 | null |
2023-02-02 | STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation | Yupeng Zheng et.al. | 2302.01334v1 | link |
2023-02-02 | Bayesian Metric Learning for Uncertainty Quantification in Image Retrieval | Frederik Warburg et.al. | 2302.01332v1 | link |
2023-02-02 | SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections | Zhaoxi Chen et.al. | 2302.01330v1 | link |
2023-02-02 | Dreamix: Video Diffusion Models are General Video Editors | Eyal Molad et.al. | 2302.01329v1 | null |
2023-02-02 | $IC^3$: Image Captioning by Committee Consensus | David M. Chan et.al. | 2302.01328v1 | link |
2023-02-02 | Randomized Greedy Learning for Non-monotone Stochastic Submodular Maximization Under Full-bandit Feedback | Fares Fourati et.al. | 2302.01324v1 | null |
2023-02-02 | Signatures for strong-field QED physics in the quantum limit of beamstrahlung | W. L. Zhang et.al. | 2302.01321v1 | null |
2023-02-01 | Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data | Alon Albalak et.al. | 2302.00674v1 | link |
2023-02-01 | ‘Generative CI’ through Collective Response Systems | Aviv Ovadya et.al. | 2302.00672v1 | null |
2023-02-01 | Efficient Multi-Task Reinforcement Learning via Selective Behavior Sharing | Grace Zhang et.al. | 2302.00671v1 | null |
2023-02-01 | Stable Target Field for Reduced Variance Score Estimation in Diffusion Models | Yilun Xu et.al. | 2302.00670v1 | link |
2023-02-01 | Does Vision Accelerate Hierarchical Generalization of Neural Language Learners? | Tatsuki Kuribayashi et.al. | 2302.00667v1 | null |
2023-02-01 | Extrinsic Calibration of 2D mm-Wavelength Radar Pairs Using Ego-Velocity Estimates | Qilong Cheng et.al. | 2302.00660v1 | null |
2023-02-01 | Graph Neural Operators for Classification of Spatial Transcriptomics Data | Junaid Ahmed et.al. | 2302.00658v1 | null |
2023-01-31 | Reverse engineering adversarial attacks with fingerprints from adversarial examples | David Aaron Nicholson et.al. | 2301.13869v1 | null |
2023-01-31 | PADL: Language-Directed Physics-Based Character Control | Jordan Juravsky et.al. | 2301.13868v1 | link |
2023-01-31 | Zero-Memory Graph Exploration with Unknown Inports | Hans-Joachim Böckenhauer et.al. | 2301.13860v1 | null |
2023-01-31 | Interpreting Robustness Proofs of Deep Neural Networks | Debangshu Banerjee et.al. | 2301.13845v1 | null |
2023-01-31 | Do Multi-Document Summarization Models Synthesize? | Jay DeYoung et.al. | 2301.13844v1 | null |
2023-01-31 | RIS-Assisted Interference Mitigation for Uplink NOMA | Azadeh Tabeshnezhad et.al. | 2301.13841v1 | null |
2023-01-30 | Looped Transformers as Programmable Computers | Angeliki Giannou et.al. | 2301.13196v1 | null |
2023-01-30 | Adaptive Computation with Elastic Input Sequence | Fuzhao Xue et.al. | 2301.13195v1 | link |
2023-01-30 | Audio-Visual Segmentation with Semantics | Jinxing Zhou et.al. | 2301.13190v1 | link |
2023-01-30 | Extracting Training Data from Diffusion Models | Nicholas Carlini et.al. | 2301.13188v1 | null |
2023-01-30 | Weighted flow diffusion for local graph clustering with node attributes: an algorithm and statistical guarantees | Shenghao Yang et.al. | 2301.13187v1 | link |
2023-01-30 | Optimal Decision Tree Policies for Markov Decision Processes | Daniël Vos et.al. | 2301.13185v1 | link |
2023-01-27 | Incorporating Background Knowledge in Symbolic Regression using a Computer Algebra System | Charles Fox et.al. | 2301.11919v1 | null |
2023-01-27 | OccRob: Efficient SMT-Based Occlusion Robustness Verification of Deep Neural Networks | Xingwu Guo et.al. | 2301.11912v1 | null |
2023-01-27 | Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees | Johanna Vielhaben et.al. | 2301.11911v1 | link |
2023-01-27 | Tree-structured Policy Planning with Learned Behavior Models | Yuxiao Chen et.al. | 2301.11902v1 | null |
2023-01-26 | Conservative Safety Monitors of Stochastic Dynamical Systems | Matthew Cleaveland et.al. | 2301.11330v1 | null |
2023-01-26 | MusicLM: Generating Music From Text | Andrea Agostinelli et.al. | 2301.11325v1 | null |
2023-01-26 | Joint Training of Deep Ensembles Fails Due to Learner Collusion | Alan Jeffares et.al. | 2301.11323v1 | null |
2023-01-26 | Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Xudong Wang et.al. | 2301.11320v1 | link |
2023-01-26 | Learning Good Features to Transfer Across Tasks and Domains | Pierluigi Zama Ramirez et.al. | 2301.11310v1 | null |
2023-01-26 | SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification | Pranjal Aggarwal et.al. | 2301.11309v1 | link |
2023-01-26 | Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series | Abdul Fatir Ansari et.al. | 2301.11308v1 | link |
2023-01-26 | DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature | Eric Mitchell et.al. | 2301.11305v1 | link |
2023-01-25 | Fillers in Spoken Language Understanding: Computational and Psycholinguistic Perspectives | Tanvi Dinkar et.al. | 2301.10761v1 | null |
2023-01-25 | Efficient Flow-Guided Multi-frame De-fencing | Stavros Tsogkas et.al. | 2301.10759v1 | null |
2023-01-25 | Room-Temperature Sputtered Ultralow-loss Silicon Nitride for Hybrid Photonic Integration | Shuangyou Zhang et.al. | 2301.10758v1 | null |
2023-01-25 | Generating large-scale network analyses of scientific landscapes in seconds using Dimensions on Google BigQuery | Michele Pasin et.al. | 2301.10736v1 | null |
2023-01-25 | The Synchronic Web | Thien-Nam Dinh et.al. | 2301.10733v1 | null |
2023-01-24 | A Watermark for Large Language Models | John Kirchenbauer et.al. | 2301.10226v1 | link |
2023-01-24 | Evolution of cooperation under a generalized death-birth process | Chaoqian Wang et.al. | 2301.10205v1 | null |
2023-01-24 | A general epidemic model and its application to mask design considering different preferences towards masks | Chaoqian Wang et.al. | 2301.10202v1 | null |
2023-01-23 | InfiniCity: Infinite-Scale City Synthesis | Chieh Hubert Lin et.al. | 2301.09637v1 | null |
2023-01-23 | Feature construction using explanations of individual predictions | Boštjan Vouk et.al. | 2301.09631v1 | null |
2023-01-23 | Tracking the industrial growth of modern China with high-resolution panchromatic imagery: A sequential convolutional approach | Ethan Brewer et.al. | 2301.09620v1 | null |
2023-01-23 | Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics | Aamal Abbas Hussain et.al. | 2301.09619v1 | null |
2023-01-20 | The stochastic digital human is now enrolling for in silico imaging trials – Methods and tools for generating digital cohorts | A Badano et.al. | 2301.08719v1 | null |
2023-01-20 | Massively Parallel Genetic Optimization through Asynchronous Propagation of Populations | Oskar Taubert et.al. | 2301.08713v1 | link |
2023-01-19 | Multiview Compressive Coding for 3D Reconstruction | Chao-Yuan Wu et.al. | 2301.08247v1 | link |
2023-01-19 | Booster: a Benchmark for Depth from Images of Specular and Transparent Surfaces | Pierluigi Zama Ramirez et.al. | 2301.08245v1 | null |
2023-01-19 | Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture | Mahmoud Assran et.al. | 2301.08243v1 | link |
2023-01-19 | Radiation-induced secondary emissions in solid-state devices as a possible contribution to quasiparticle poisoning of superconducting circuits | Francisco Ponce et.al. | 2301.08239v1 | null |
2023-01-18 | Robust Zero-crossings Detection in Noisy Signals using Topological Signal Processing | Sunia Tanweer et.al. | 2301.07703v1 | null |
2023-01-18 | Learning 3D-aware Image Synthesis with Unknown Pose Distribution | Zifan Shi et.al. | 2301.07702v1 | null |
2023-01-18 | Prony-Based Super-Resolution Phase Retrieval of Sparse, Multivariate Signals | Robert Beinert et.al. | 2301.07696v1 | null |
2023-01-18 | Private Federated Submodel Learning via Private Set Union | Zhusheng Wang et.al. | 2301.07686v1 | null |
2023-01-18 | SFQEDtoolkit: a high-performance library for the accurate modeling of strong-field QED processes in PIC and Monte Carlo codes | Samuele Montefiori et.al. | 2301.07684v1 | link |
2023-01-18 | OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation | Tong Wu et.al. | 2301.07525v1 | null |
2023-01-17 | Three Dimensional Odd Viscosity in Ferrofluids with Vorticity-Magnetization Coupling | Dylan Reynolds et.al. | 2301.07096v1 | null |
2023-01-17 | On the State of German (Abstractive) Text Summarization | Dennis Aumiller et.al. | 2301.07095v1 | link |
2023-01-17 | Learning Customized Visual Models with Retrieval-Augmented Knowledge | Haotian Liu et.al. | 2301.07094v1 | link |
2023-01-17 | GLIGEN: Open-Set Grounded Text-to-Image Generation | Yuheng Li et.al. | 2301.07093v1 | link |
2023-01-17 | Vision Learners Meet Web Image-Text Pairs | Bingchen Zhao et.al. | 2301.07088v1 | null |
2023-01-17 | MooseNet: A trainable metric for synthesized speech with plda backend | Ondřej Plátek et.al. | 2301.07087v1 | link |
2023-01-17 | Transformers as Algorithms: Generalization and Implicit Model Selection in In-context Learning | Yingcong Li et.al. | 2301.07067v1 | link |
2023-01-13 | Non-Stochastic CDF Estimation Using Threshold Queries | Princewill Okoroafor et.al. | 2301.05682v1 | null |
2023-01-12 | See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning | Zhenfang Chen et.al. | 2301.05226v1 | null |
2023-01-12 | Domain Expansion of Image Generators | Yotam Nitzan et.al. | 2301.05225v1 | null |
2023-01-12 | Guiding Text-to-Image Diffusion Model Towards Grounded Generation | Ziyi Li et.al. | 2301.05221v1 | null |
2023-01-12 | Adversarial Adaptation for French Named Entity Recognition | Arjun Choudhry et.al. | 2301.05220v1 | link |
2023-01-12 | NDNSD: Service Publishing and Discovery in NDN | Saurab Dulal et.al. | 2301.05218v1 | null |
(<a href=#Updated-on-20240404>back to top</a>)
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-04-03 | Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Keyu Tian et.al. | 2404.02905v1 | link |
2024-04-03 | LidarDM: Generative LiDAR Simulation in a Generated World | Vlas Zyrianov et.al. | 2404.02903v1 | null |
2024-04-03 | DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets | Harsh Rangwani et.al. | 2404.02900v1 | link |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899v1 | null |
2024-04-03 | A Mean Field Game Model for Timely Computation in Edge Computing Systems | Shubham Aggarwal et.al. | 2404.02898v1 | null |
2024-04-03 | Deep Image Composition Meets Image Forgery | Eren Tahir et.al. | 2404.02897v1 | link |
2024-04-03 | ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Yifan Xu et.al. | 2404.02893v1 | null |
2024-04-03 | PoCo: Point Context Cluster for RGBD Indoor Place Recognition | Jing Liang et.al. | 2404.02885v1 | null |
2024-04-02 | Segment Any 3D Object with Language | Seungjun Lee et.al. | 2404.02157v1 | null |
2024-04-02 | Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration | Akshay Dudhane et.al. | 2404.02154v1 | null |
2024-04-02 | GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image | Chong Bao et.al. | 2404.02152v1 | null |
2024-04-02 | Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models | Zeyu Yang et.al. | 2404.02148v1 | link |
2024-04-02 | Harder, Better, Faster, Stronger: Interactive Visualization for Human-Centered AI Tools | Md Naimul Hoque et.al. | 2404.02147v1 | null |
2024-04-02 | Iterated Learning Improves Compositionality in Large Vision-Language Models | Chenhao Zheng et.al. | 2404.02145v1 | null |
2024-04-02 | Multiparametric quantification and visualization of liver fat using ultrasound | Jihye Baek et.al. | 2404.02143v1 | null |
2024-03-29 | Gecko: Versatile Text Embeddings Distilled from Large Language Models | Jinhyuk Lee et.al. | 2403.20327v1 | null |
2024-03-29 | Shaving Logs via Large Sieve Inequality: Faster Algorithms for Sparse Convolution and More | Ce Jin et.al. | 2403.20326v1 | null |
2024-03-29 | Structure and Dynamics of Magneto-Inertial, Differentially Rotating Laboratory Plasmas | V. Valenzuela-Villaseca et.al. | 2403.20321v1 | null |
2024-03-29 | SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects | Abhinav Kumar et.al. | 2403.20318v1 | link |
2024-03-29 | Convolutional Prompting meets Language Models for Continual Learning | Anurag Roy et.al. | 2403.20317v1 | null |
2024-03-29 | Optimal Communication for Classic Functions in the Coordinator Model and Beyond | Hossein Esfandiari et.al. | 2403.20307v1 | null |
2024-03-28 | GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling | Bowen Zhang et.al. | 2403.19655v1 | null |
2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu et.al. | 2403.19653v1 | link |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652v1 | null |
2024-03-28 | GraspXL: Generating Grasping Motions for Diverse Objects at Scale | Hui Zhang et.al. | 2403.19649v1 | null |
2024-03-28 | Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models | Samuel Marks et.al. | 2403.19647v1 | link |
2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva et.al. | 2403.19645v1 | null |
2024-03-27 | Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark | Ziyang Chen et.al. | 2403.18821v1 | null |
2024-03-27 | MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering | Guoxing Sun et.al. | 2403.18820v1 | null |
2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | Daniel Winter et.al. | 2403.18818v1 | null |
2024-03-27 | Garment3DGen: 3D Garment Stylization and Texture Generation | Nikolaos Sarafianos et.al. | 2403.18816v1 | null |
2024-03-27 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Yanwei Li et.al. | 2403.18814v1 | link |
2024-03-27 | Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment | Li Siyao et.al. | 2403.18811v1 | null |
2024-03-28 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807v2 | link |
2024-03-26 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal et.al. | 2403.17936v1 | null |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935v1 | link |
2024-03-26 | SLEDGE: Synthesizing Simulation Environments for Driving Agents with Generative Models | Kashyap Chitta et.al. | 2403.17933v1 | null |
2024-03-26 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution | Wei Tao et.al. | 2403.17927v1 | null |
2024-03-26 | AID: Attention Interpolation of Text-to-Image Diffusion | Qiyuan He et.al. | 2403.17924v1 | link |
2024-03-26 | The Need for Speed: Pruning Transformers with One Recipe | Samir Khaki et.al. | 2403.17921v1 | link |
2024-03-26 | TC4D: Trajectory-Conditioned Text-to-4D Generation | Sherwin Bahmani et.al. | 2403.17920v1 | null |
2024-03-26 | AgentStudio: A Toolkit for Building General Virtual Agents | Longtao Zheng et.al. | 2403.17918v1 | null |
2024-03-25 | Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning | Sicong Pan et.al. | 2403.16803v1 | null |
2024-03-25 | Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback | Zhangqian Bi et.al. | 2403.16792v1 | null |
2024-03-25 | Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise | Dilum Fernando et.al. | 2403.16790v1 | null |
2024-03-25 | HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation | Linglin Jing et.al. | 2403.16788v1 | null |
2024-03-25 | Creating a Digital Twin of Spinal Surgery: A Proof of Concept | Jonas Hein et.al. | 2403.16736v1 | null |
2024-03-25 | Improving Diffusion Models’s Data-Corruption Resistance using Scheduled Pseudo-Huber Loss | Artem Khrapov et.al. | 2403.16728v1 | link |
2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye et.al. | 2403.15389v1 | null |
2024-03-22 | LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis | Kevin Xie et.al. | 2403.15385v1 | null |
2024-03-22 | ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars | Zhenwei Wang et.al. | 2403.15383v1 | null |
2024-03-22 | DragAPart: Learning a Part-Level Motion Prior for Articulated Objects | Ruining Li et.al. | 2403.15382v1 | null |
2024-03-22 | Long-CLIP: Unlocking the Long-Text Capability of CLIP | Beichen Zhang et.al. | 2403.15378v1 | link |
2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | Yi Wang et.al. | 2403.15377v1 | link |
2024-03-22 | A Modular, End-to-End Next-Generation Network Testbed: Towards a Fully Automated Network Management Platform | Ali Chouman et.al. | 2403.15376v1 | null |
2024-03-21 | Zero-Shot Multi-Object Shape Completion | Shun Iwase et.al. | 2403.14628v1 | null |
2024-03-21 | MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images | Yuedong Chen et.al. | 2403.14627v1 | link |
2024-03-21 | Simplified Diffusion Schrödinger Bridge | Zhicong Tang et.al. | 2403.14623v1 | link |
2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Yinghao Xu et.al. | 2403.14621v1 | link |
2024-03-21 | ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition | Tianhao Wu et.al. | 2403.14619v1 | null |
2024-03-21 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Xiang Fan et.al. | 2403.14617v1 | null |
2024-03-21 | Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning | Hasindri Watawana et.al. | 2403.14616v1 | link |
2024-03-21 | DreamReward: Text-to-3D Generation with Human Preference | Junliang Ye et.al. | 2403.14613v1 | null |
2024-03-21 | Explorative Inbetweening of Time and Space | Haiwen Feng et.al. | 2403.14611v1 | null |
2024-03-20 | On Pretraining Data Diversity for Self-Supervised Learning | Hasan Abed Al Kader Hammoud et.al. | 2403.13808v1 | link |
2024-03-20 | Editing Massive Concepts in Text-to-Image Diffusion Models | Tianwei Xiong et.al. | 2403.13807v1 | link |
2024-03-20 | Learning from Models and Data for Visual Grounding | Ruozhen He et.al. | 2403.13804v1 | null |
2024-03-20 | Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments | Yang Yang et.al. | 2403.13803v1 | link |
2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | Vincent Tao Hu et.al. | 2403.13802v1 | link |
2024-03-20 | Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs | Yusuke Mikami et.al. | 2403.13801v1 | link |
2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | Jingxi Chen et.al. | 2403.13800v1 | null |
2024-03-20 | Reverse Training to Nurse the Reversal Curse | Olga Golovneva et.al. | 2403.13799v1 | null |
2024-03-20 | Hierarchical NeuroSymbolic Approach for Action Quality Assessment | Lauren Okamoto et.al. | 2403.13798v1 | null |
2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | Chao Yi et.al. | 2403.13797v1 | null |
2024-03-19 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Zhuoshi Pan et.al. | 2403.12968v1 | link |
2024-03-19 | Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment | Mengting Chen et.al. | 2403.12965v1 | null |
2024-03-19 | Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models | Ce Zhang et.al. | 2403.12964v1 | link |
2024-03-19 | FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Linjiang Huang et.al. | 2403.12963v1 | link |
2024-03-19 | TexTile: A Differentiable Metric for Texture Tileability | Carlos Rodriguez-Pardo et.al. | 2403.12961v1 | null |
2024-03-19 | FaceXFormer: A Unified Transformer for Facial Analysis | Kartik Narayan et.al. | 2403.12960v1 | link |
2024-03-19 | GVGEN: Text-to-3D Generation with Volumetric Representation | Xianglong He et.al. | 2403.12957v1 | null |
2024-03-19 | Abiogenesis: a possible quantum interpretation of the telepoietic conjecture | Vittorio Cocchi et.al. | 2403.12955v1 | null |
2024-03-19 | Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Elaine Sui et.al. | 2403.12952v1 | link |
2024-03-18 | RIS-aided Single-frequency 3D Imaging by Exploiting Multi-view Image Correlations | Yixuan Huang et.al. | 2403.11764v1 | null |
2024-03-19 | Full-Duplex MU-MIMO Systems with Coarse Quantization: How Many Bits Do We Need? | Seunghyeong Yoo et.al. | 2403.11762v2 | null |
2024-03-18 | Why E.T. Can’t Phone Home: A Global View on IP-based Geoblocking at VoWiFi | Gabriel Karl Gegenhuber et.al. | 2403.11759v1 | null |
2024-03-18 | Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs | M. Jehanzeb Mirza et.al. | 2403.11755v1 | link |
2024-03-18 | Asymptotically Optimal Codes for $(t,s)$-Burst Error | Yubo Sun et.al. | 2403.11750v1 | null |
2024-03-18 | Embedded Named Entity Recognition using Probing Classifiers | Nicholas Popovič et.al. | 2403.11747v1 | null |
2024-03-18 | Revisiting Tensor Basis Neural Networks for Reynolds stress modeling: application to plane channel and square duct flows | Jiayi Cai et.al. | 2403.11746v1 | null |
2024-03-18 | Matter and cosmogenesis in Kant’s Theory of the Heavens | Garance Benoit et.al. | 2403.11710v1 | null |
2024-03-18 | Significant impact of light-matter strong coupling on chiral nonlinear optical effect | Daichi Okada et.al. | 2403.11709v1 | null |
2024-03-18 | Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models | Emilian Postolache et.al. | 2403.11706v1 | link |
2024-03-18 | Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing | Juan Zhang et.al. | 2403.11700v1 | null |
2024-03-18 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697v1 | null |
2024-03-18 | Generalization error of spectral algorithms | Maksim Velikanov et.al. | 2403.11696v1 | null |
2024-03-18 | Beamforming Design for Semantic-Bit Coexisting Communication System | Maojun Zhang et.al. | 2403.11693v1 | null |
2024-03-15 | P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors | Zhou Jiang et.al. | 2403.10521v1 | null |
2024-03-15 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Ronghui Li et.al. | 2403.10518v1 | link |
2024-03-15 | FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Stephanie Fu et.al. | 2403.10516v1 | link |
2024-03-15 | A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction | Anshul Gupta et.al. | 2403.10511v1 | null |
2024-03-15 | Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization | Ratnadira Widyasari et.al. | 2403.10507v1 | null |
2024-03-15 | Belief Change based on Knowledge Measures | Umberto Straccia et.al. | 2403.10502v1 | null |
2024-03-14 | SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior | Huan-ang Gao et.al. | 2403.09638v1 | null |
2024-03-14 | GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping | Yuhang Zheng et.al. | 2403.09637v1 | link |
2024-03-14 | Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference | Piotr Nawrot et.al. | 2403.09636v1 | null |
2024-03-14 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong et.al. | 2403.09634v1 | null |
2024-03-14 | Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image | Yiqun Mei et.al. | 2403.09632v1 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631v1 | null |
2024-03-14 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang et.al. | 2403.09630v1 | link |
2024-03-14 | Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Eric Zelikman et.al. | 2403.09629v1 | link |
2024-03-14 | Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation | Fangfu Liu et.al. | 2403.09625v1 | null |
2024-03-14 | Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering | Zeyu Liu et.al. | 2403.09622v1 | null |
2024-03-13 | FastMAC: Stochastic Spectral Sampling of Correspondence Graph | Yifei Zhang et.al. | 2403.08770v1 | link |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764v1 | null |
2024-03-13 | A local model for the optical energy and momentum transfer in dielectric media and the microscopic origin of Abraham’s force density | B. Anghinoni et.al. | 2403.08752v1 | null |
2024-03-13 | iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer | Dinh-Khoi Vo et.al. | 2403.08746v1 | link |
2024-03-12 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension | Fangyun Wei et.al. | 2403.07872v1 | null |
2024-03-12 | TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation | Shivin Dass et.al. | 2403.07869v1 | null |
2024-03-12 | Exploring Safety Generalization Challenges of Large Language Models via Code | Qibing Ren et.al. | 2403.07865v1 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860v1 | link |
2024-03-12 | Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias | Sierra Wyllie et.al. | 2403.07857v1 | null |
2024-03-12 | Quantifying and Mitigating Privacy Risks for Tabular Generative Models | Chaoyi Zhu et.al. | 2403.07842v1 | null |
2024-03-11 | A representation-learning game for classes of prediction tasks | Neria Uzan et.al. | 2403.06971v1 | null |
2024-03-11 | The pitfalls of next-token prediction | Gregor Bachmann et.al. | 2403.06963v1 | link |
2024-03-11 | Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer | Siddhant Satyanaik et.al. | 2403.06953v1 | null |
2024-03-11 | SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data | Jialu Li et.al. | 2403.06952v1 | null |
2024-03-08 | Tell, Don’t Show!: Language Guidance Eases Transfer Across Domains in Images and Videos | Tarun Kalluri et.al. | 2403.05535v1 | null |
2024-03-08 | Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets | Lorenzo Brigato et.al. | 2403.05532v1 | null |
2024-03-08 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | Machel Reid et.al. | 2403.05530v1 | null |
2024-03-08 | The Computational Complexity of Learning Gaussian Single-Index Models | Alex Damian et.al. | 2403.05529v1 | null |
2024-03-08 | GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Hao Kang et.al. | 2403.05527v1 | link |
2024-03-08 | Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola | Yijiang Li et.al. | 2403.05523v1 | null |
2024-03-08 | Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought | James Chua et.al. | 2403.05518v1 | link |
2024-03-07 | BloomGML: Graph Machine Learning through the Lens of Bilevel Optimization | Amber Yijia Zheng et.al. | 2403.04763v1 | link |
2024-03-07 | Lifelong Intelligence Beyond the Edge using Hyperdimensional Computing | Xiaofan Yu et.al. | 2403.04759v1 | link |
2024-03-07 | KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts | Adam Coscia et.al. | 2403.04758v1 | link |
2024-03-07 | Preliminary Guidelines For Combining Data Integration and Visual Data Analysis | Adam Coscia et.al. | 2403.04757v1 | link |
2024-03-07 | Mechanism for Decision-aware Collaborative Federated Learning: A Pitfall of Shapley Values | Meng Qi et.al. | 2403.04753v1 | null |
2024-03-07 | JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework | Artur P. Toshev et.al. | 2403.04750v1 | link |
2024-03-07 | A General Calibrated Regret Metric for Detecting and Mitigating Human-Robot Interaction Failures | Kensuke Nakamura et.al. | 2403.04745v1 | null |
2024-03-06 | Backtracing: Retrieving the Cause of the Query | Rose E. Wang et.al. | 2403.03956v1 | link |
2024-03-06 | 3D Diffusion Policy | Yanjie Ze et.al. | 2403.03954v1 | link |
2024-03-06 | Bridging Language and Items for Retrieval and Recommendation | Yupeng Hou et.al. | 2403.03952v1 | link |
2024-03-06 | Can Audio Reveal Music Performance Difficulty? Insights from the Piano Syllabus Dataset | Pedro Ramoneda et.al. | 2403.03947v1 | null |
2024-03-06 | Separate and Detailed Treatment of Absolute Signal and Noise Enables NMR Under Adverse Circumstances | A Guinness et.al. | 2403.03943v1 | null |
2024-03-06 | The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models | Adithya Bhaskar et.al. | 2403.03942v1 | link |
2024-03-06 | GUIDE: Guidance-based Incremental Learning with Diffusion Models | Bartosz Cywiński et.al. | 2403.03938v1 | link |
2024-03-05 | LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits | Masahiro Kato et.al. | 2403.03219v1 | null |
2024-03-05 | The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | Nathaniel Li et.al. | 2403.03218v1 | null |
2024-03-05 | Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion | Meng Zheng et.al. | 2403.03217v1 | null |
2024-03-05 | A Safety-Critical Framework for UGVs in Complex Environments: A Data-Driven Discrepancy-Aware Approach | Skylar X. Wei et.al. | 2403.03215v1 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206v1 | null |
2024-03-05 | CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments | Savitha Sam Abraham et.al. | 2403.03203v1 | null |
2024-03-03 | Bandit Profit-maximization for Targeted Marketing | Joon Suk Huh et.al. | 2403.01361v1 | null |
2024-03-03 | ModelWriter: Text & Model-Synchronized Document Engineering Platform | Ferhat Erata et.al. | 2403.01359v1 | null |
2024-03-03 | Improving Uncertainty Sampling with Bell Curve Weight Function | Zan-Kai Chong et.al. | 2403.01352v1 | null |
2024-03-03 | Efficient FIR filtering with Bit Layer Multiply Accumulator | Vincenzo Liguori et.al. | 2403.01351v1 | null |
2024-03-02 | ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation | Siyuan Bian et.al. | 2403.01345v1 | null |
2024-02-29 | DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Muyang Li et.al. | 2402.19481v1 | link |
2024-02-29 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Tsai-Shien Chen et.al. | 2402.19479v1 | null |
2024-02-29 | Learning a Generalized Physical Face Model From Data | Lingchen Yang et.al. | 2402.19477v1 | null |
2024-02-29 | The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? | Alex Gu et.al. | 2402.19475v1 | null |
2024-02-29 | The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Weiyun Wang et.al. | 2402.19474v1 | link |
2024-02-29 | Retrieval-Augmented Generation for AI-Generated Content: A Survey | Penghao Zhao et.al. | 2402.19473v1 | link |
2024-02-29 | Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling | Gabriel Grand et.al. | 2402.19471v1 | null |
2024-02-29 | Humanoid Locomotion as Next Token Prediction | Ilija Radosavovic et.al. | 2402.19469v1 | null |
2024-02-28 | UniMODE: Unified Monocular 3D Object Detection | Zhuoling Li et.al. | 2402.18573v1 | null |
2024-02-28 | Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards | Haoxiang Wang et.al. | 2402.18571v1 | link |
2024-02-28 | Diffusion Language Models Are Versatile Protein Learners | Xinyou Wang et.al. | 2402.18567v1 | null |
2024-02-28 | Approaching Human-Level Forecasting with Language Models | Danny Halawi et.al. | 2402.18563v1 | null |
2024-02-27 | The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Shuming Ma et.al. | 2402.17764v1 | null |
2024-02-27 | Reducing Unnecessary Alerts in Pedestrian Protection Systems Based on P2V Communications | Ignacio Soto et.al. | 2402.17763v1 | null |
2024-02-27 | Towards Optimal Learning of Language Models | Yuxian Gu et.al. | 2402.17759v1 | null |
2024-02-27 | ADL4D: Towards A Contextually Rich Dataset for 4D Activities of Daily Living | Marsil Zakour et.al. | 2402.17758v1 | null |
2024-02-27 | Evaluating Very Long-Term Conversational Memory of LLM Agents | Adyasha Maharana et.al. | 2402.17753v1 | null |
2024-02-26 | Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision | Fan Jiang et.al. | 2402.16508v1 | link |
2024-02-26 | Stochastic Conditional Diffusion Models for Semantic Image Synthesis | Juyeon Ko et.al. | 2402.16506v1 | null |
2024-02-26 | SAND: Decoupling Sanitization from Fuzzing for Low Overhead | Ziqiao Kong et.al. | 2402.16497v1 | null |
2024-02-26 | Intelligent Known and Novel Aircraft Recognition – A Shift from Classification to Similarity Learning for Combat Identification | Ahmad Saeed et.al. | 2402.16486v1 | null |
2024-02-23 | Seamless Human Motion Composition with Blended Positional Encodings | German Barquero et.al. | 2402.15509v1 | link |
2024-02-23 | AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning | Jianguo Zhang et.al. | 2402.15506v1 | link |
2024-02-23 | Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts | Yuejiang Liu et.al. | 2402.15505v1 | null |
2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh et.al. | 2402.15504v1 | link |
2024-02-23 | API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Kinjal Basu et.al. | 2402.15491v1 | null |
2024-02-22 | PALO: A Polyglot Large Multimodal Model for 5B People | Muhammad Maaz et.al. | 2402.14818v1 | link |
2024-02-22 | Cameras as Rays: Pose Estimation via Ray Diffusion | Jason Y. Zhang et.al. | 2402.14817v1 | null |
2024-02-22 | WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition | Lianghui Zhu et.al. | 2402.14812v1 | link |
2024-02-22 | Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking | Nikhil Prakash et.al. | 2402.14811v1 | null |
2024-02-22 | GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion | Xueyi Liu et.al. | 2402.14810v1 | link |
2024-02-22 | CriticBench: Benchmarking LLMs for Critique-Correct Reasoning | Zicheng Lin et.al. | 2402.14809v1 | link |
2024-02-22 | RelayAttention for Efficient Large Language Model Serving with Long System Prompts | Lei Zhu et.al. | 2402.14808v1 | link |
2024-02-22 | A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health | Nikhil Behari et.al. | 2402.14807v1 | null |
2024-02-22 | Identifying Multiple Personalities in Large Language Models with External Evaluation | Xiaoyang Song et.al. | 2402.14805v1 | null |
2024-02-21 | D-Flow: Differentiating through Flows for Controlled Generation | Heli Ben-Hamu et.al. | 2402.14017v1 | null |
2024-02-21 | Corrective Machine Unlearning | Shashwat Goel et.al. | 2402.14015v1 | link |
2024-02-21 | Geometry-Informed Neural Networks | Arturs Berzins et.al. | 2402.14009v1 | null |
2024-02-21 | OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems | Chaoqun He et.al. | 2402.14008v1 | link |
2024-02-21 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models | Aline Ioste et.al. | 2402.14002v1 | null |
2024-02-21 | Real-time 3D-aware Portrait Editing from a Single Image | Qingyan Bai et.al. | 2402.14000v1 | null |
2024-02-20 | CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples | Jianrui Zhang et.al. | 2402.13254v1 | link |
2024-02-20 | BiMediX: Bilingual Medical Mixture of Experts LLM | Sara Pieri et.al. | 2402.13253v1 | link |
2024-02-20 | Video ReCap: Recursive Captioning of Hour-Long Videos | Md Mohaiminul Islam et.al. | 2402.13250v1 | null |
2024-02-20 | TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization | Liyan Tang et.al. | 2402.13249v1 | link |
2024-02-20 | Are Fact-Checking Tools Reliable? An Evaluation of Google Fact Check | Qiangeng Yang et.al. | 2402.13244v1 | null |
2024-02-20 | Unlocking Insights: Semantic Search in Jupyter Notebooks | Lan Li et.al. | 2402.13234v1 | null |
2024-02-20 | A Touch, Vision, and Language Dataset for Multimodal Alignment | Letian Fu et.al. | 2402.13232v1 | link |
2024-02-19 | FiT: Flexible Vision Transformer for Diffusion Model | Zeyu Lu et.al. | 2402.12376v1 | link |
2024-02-19 | A synthetic data approach for domain generalization of NLI models | Mohammad Javad Hosseini et.al. | 2402.12368v1 | null |
2024-02-19 | A Critical Evaluation of AI Feedback for Aligning Large Language Models | Archit Sharma et.al. | 2402.12366v1 | link |
2024-02-19 | Almost-linear time parameterized algorithm for rankwidth via dynamic rankwidth | Tuukka Korhonen et.al. | 2402.12364v1 | null |
2024-02-19 | Flip Graphs of Pseudo-Triangulations With Face Degree at Most 4 | Maarten Löffler et.al. | 2402.12357v1 | null |
2024-02-19 | Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge | Julien Delile et.al. | 2402.12352v1 | null |
2024-02-16 | Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning | Chia-Ling Tsai et.al. | 2402.10894v1 | null |
2024-02-16 | RLVF: Learning from Verbal Feedback without Overgeneralization | Moritz Stephan et.al. | 2402.10893v1 | link |
2024-02-16 | Instruction Diversity Drives Generalization To Unseen Tasks | Dylan Zhang et.al. | 2402.10891v1 | null |
2024-02-16 | When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Ziru Chen et.al. | 2402.10890v1 | link |
2024-02-16 | Evaluation of EAP Usage for Authenticating Eduroam Users in 5G Networks | Leonardo Azalim de Oliveira et.al. | 2402.10889v1 | null |
2024-02-16 | Explainability for Machine Learning Models: From Data Adaptability to User Perception | Julien Delaunay et.al. | 2402.10888v1 | null |
2024-02-16 | Reviewer2: Optimizing Review Generation Through Prompt Generation | Zhaolin Gao et.al. | 2402.10886v1 | null |
2024-02-16 | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Tsung-Wei Ke et.al. | 2402.10885v1 | null |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210v1 | null |
2024-02-15 | Recovering the Pre-Fine-Tuning Weights of Generative Models | Eliahu Horwitz et.al. | 2402.10208v1 | link |
2024-02-15 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207v1 | link |
2024-02-15 | Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention | Romain Ilbert et.al. | 2402.10198v1 | link |
2024-02-15 | BitDelta: Your Fine-Tune May Only Be Worth One Bit | James Liu et.al. | 2402.10193v1 | link |
2024-02-15 | Multi-Excitation Projective Simulation with a Many-Body Physics Inspired Inductive Bias | Philip A. LeMaitre et.al. | 2402.10192v1 | link |
2024-02-15 | FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients | Xinchi Qiu et.al. | 2402.10191v1 | null |
2024-02-14 | AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability | Siwei Yang et.al. | 2402.09404v1 | link |
2024-02-14 | Reinforcement Learning from Human Feedback with Active Queries | Kaixuan Ji et.al. | 2402.09401v1 | null |
2024-02-14 | Long-form evaluation of model editing | Domenic Rosati et.al. | 2402.09394v1 | null |
2024-02-14 | Introduction to Physically Unclonable Functions: Properties and Applications | M. Garcia-Bosque et.al. | 2402.09386v1 | null |
2024-02-14 | GraSSRep: Graph-Based Self-Supervised Learning for Repeat Detection in Metagenomic Assembly | Ali Azizpour et.al. | 2402.09381v1 | link |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682v1 | null |
2024-02-13 | Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance | Linxi Zhao et.al. | 2402.08680v1 | null |
2024-02-13 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Xingang Guo et.al. | 2402.08679v1 | link |
2024-02-13 | Graph Mamba: Towards Learning on Graphs with State Space Models | Ali Behrouz et.al. | 2402.08678v1 | link |
2024-02-13 | Model Assessment and Selection under Temporal Distribution Shift | Elise Han et.al. | 2402.08672v1 | link |
2024-02-13 | Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models | Yuqing Liu et.al. | 2402.08670v1 | null |
2024-02-13 | Improving Generalization in Semantic Parsing by Increasing Natural Language Variation | Irina Saparina et.al. | 2402.08666v1 | link |
2024-02-12 | A systematic investigation of learnability from single child linguistic input | Yulu Qin et.al. | 2402.07899v1 | null |
2024-02-12 | Label-Efficient Model Selection for Text Generation | Shir Ashury-Tahan et.al. | 2402.07891v1 | null |
2024-02-12 | Toward an Android Static Analysis Approach for Data Protection | Mugdha Khedkar et.al. | 2402.07889v1 | null |
2024-02-12 | WildfireGPT: Tailored Large Language Model for Wildfire Analysis | Yangxinyu Xie et.al. | 2402.07877v1 | null |
2024-02-12 | Policy Improvement using Language Feedback Models | Victor Zhong et.al. | 2402.07876v1 | null |
2024-02-09 | Feedback Loops With Language Models Drive In-Context Reward Hacking | Alexander Pan et.al. | 2402.06627v1 | link |
2024-02-09 | Understanding the Effects of Iterative Prompting on Truthfulness | Satyapriya Krishna et.al. | 2402.06625v1 | null |
2024-02-09 | A two-stage algorithm in evolutionary product unit neural networks for classification | Antonio J. Tallón-Ballesteros et.al. | 2402.06622v1 | null |
2024-02-09 | TIC: Translate-Infer-Compile for accurate ‘text to plan’ using LLMs and logical intermediate representations | Sudhir Agarwal et.al. | 2402.06608v1 | null |
2024-02-09 | On the Out-Of-Distribution Generalization of Multimodal Large Language Models | Xingxuan Zhang et.al. | 2402.06599v1 | null |
2024-02-09 | CigaR: Cost-efficient Program Repair with LLMs | Dávid Hidvégi et.al. | 2402.06598v1 | link |
2024-02-09 | Understanding the Weakness of Large Language Model Agents within a Complex Android Environment | Mingzhe Xing et.al. | 2402.06596v1 | link |
2024-02-08 | InstaGen: Enhancing Object Detection by Training on Synthetic Dataset | Chengjian Feng et.al. | 2402.05937v1 | null |
2024-02-08 | SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Peng Gao et.al. | 2402.05935v1 | link |
2024-02-08 | Time Series Diffusion in the Frequency Domain | Jonathan Crabbé et.al. | 2402.05933v1 | link |
2024-02-08 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | Xing Han Lù et.al. | 2402.05930v1 | link |
2024-02-08 | An Interactive Agent Foundation Model | Zane Durante et.al. | 2402.05929v1 | null |
2024-02-08 | Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss | Ingvar Ziemann et.al. | 2402.05928v1 | null |
2024-02-07 | Image captioning for Brazilian Portuguese using GRIT model | Rafael Silva de Alencar et.al. | 2402.05106v1 | null |
2024-02-07 | You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models | Alix Decrop et.al. | 2402.05102v1 | null |
2024-02-07 | Hydragen: High-Throughput LLM Inference with Shared Prefixes | Jordan Juravsky et.al. | 2402.05099v1 | null |
2024-02-07 | On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling | Marcin Sendera et.al. | 2402.05098v1 | link |
2024-02-07 | Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation | Dennis Hoftijzer et.al. | 2402.05090v1 | null |
2024-02-07 | Hyperspectral acquisition with ScanImage at the single pixel level: Application to time domain coherent Raman imaging | Samuel Metais et.al. | 2402.05086v1 | null |
2024-02-06 | Linear-time Minimum Bayes Risk Decoding with Reference Aggregation | Jannis Vamvas et.al. | 2402.04251v1 | link |
2024-02-06 | CAST: Clustering Self-Attention using Surrogate Tokens for Efficient Transformers | Adjorn van Engelenhoven et.al. | 2402.04239v1 | null |
2024-02-06 | CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | Ji Qi et.al. | 2402.04236v1 | link |
2024-02-06 | Role of spontaneously generated coherence (SGC) in laser cooling of atoms | Rajnandan Choudhury Das et.al. | 2402.04234v1 | null |
2024-02-06 | Can Generative Agents Predict Emotion? | Ciaran Regan et.al. | 2402.04232v1 | null |
2024-02-06 | Further Constructions of AMUBs for Non-prime power Composite Dimensions | Ajeet Kumar et.al. | 2402.04231v1 | null |
2024-02-05 | Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | Qiyao Liang et.al. | 2402.03305v1 | null |
2024-02-05 | GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models | Haibo Jin et.al. | 2402.03299v1 | null |
2024-02-05 | Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks | Yongchang Hao et.al. | 2402.03295v1 | null |
2024-02-05 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang et.al. | 2402.03290v1 | link |
2024-02-05 | Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS | Matthew DeLorenzo et.al. | 2402.03289v1 | null |
2024-02-05 | A Lennard-Jones Layer for Distribution Normalization | Mulun Na et.al. | 2402.03287v1 | null |
2024-02-05 | Training-Free Consistent Text-to-Image Generation | Yoad Tewel et.al. | 2402.03286v1 | null |
2024-02-05 | Towards a Flexible Scale-out Framework for Efficient Visual Data Query Processing | Rohit Verma et.al. | 2402.03283v1 | null |
2024-02-02 | Position Paper: Generalized grammar rules and structure-based generalization beyond classical equivariance for lexical tasks and transduction | Mircea Petrache et.al. | 2402.01629v1 | null |
2024-02-02 | Stochastic Two Points Method for Deep Model Zeroth-order Optimization | Yijiang Pang et.al. | 2402.01621v1 | null |
2024-02-02 | MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Justin Chih-Yao Chen et.al. | 2402.01620v1 | link |
2024-02-02 | Style Vectors for Steering Generative Large Language Model | Kai Konen et.al. | 2402.01618v1 | link |
2024-02-02 | A GP-based Robust Motion Planning Framework for Agile Autonomous Robot Navigation and Recovery in Unknown Environments | Nicholas Mohammad et.al. | 2402.01617v1 | null |
2024-02-01 | AToM: Amortized Text-to-Mesh using 2D Diffusion | Guocheng Qian et.al. | 2402.00867v1 | null |
2024-02-01 | Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection | Qinyu Zhao et.al. | 2402.00865v1 | link |
2024-02-01 | Evaluating Large Language Models for Generalization and Robustness via Data Compression | Yucheng Li et.al. | 2402.00861v1 | link |
2024-02-01 | Can Large Language Models Understand Context? | Yilun Zhu et.al. | 2402.00858v1 | null |
2024-02-01 | SymbolicAI: A framework for logic-based approaches combining generative models and solvers | Marius-Constantin Dinu et.al. | 2402.00854v1 | link |
2024-02-01 | LTAU-FF: Loss Trajectory Analysis for Uncertainty in Atomistic Force Fields | Joshua A. Vita et.al. | 2402.00853v1 | null |
2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | Daniel Geng et.al. | 2401.18085v1 | null |
2024-01-31 | Improved Scene Landmark Detection for Camera Localization | Tien Do et.al. | 2401.18083v1 | link |
2024-01-31 | Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? | Andreas Opedal et.al. | 2401.18070v1 | null |
2024-01-30 | A simple, strong baseline for building damage detection on the xBD dataset | Sebastian Gerard et.al. | 2401.17271v1 | link |
2024-01-30 | Weaver: Foundation Models for Creative Writing | Tiannan Wang et.al. | 2401.17268v1 | null |
2024-01-30 | Proactive Detection of Voice Cloning with Localized Watermarking | Robin San Roman et.al. | 2401.17264v1 | link |
2024-01-30 | Weak-to-Strong Jailbreaking on Large Language Models | Xuandong Zhao et.al. | 2401.17256v1 | link |
2024-01-29 | Endo-4DGS: Distilling Depth Ranking for Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting | Yiming Huang et.al. | 2401.16416v1 | null |
2024-01-29 | A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect | Yunkang Cao et.al. | 2401.16402v1 | null |
2024-01-29 | Amazon’s 2023 Drought: Sentinel-1 Reveals Extreme Rio Negro River Contraction | Fabien H Wagner et.al. | 2401.16393v1 | null |
2024-01-26 | EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Yuhui Li et.al. | 2401.15077v1 | link |
2024-01-26 | Annotated Hands for Generative Models | Yue Yang et.al. | 2401.15075v1 | link |
2024-01-26 | From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities | Chaochao Lu et.al. | 2401.15071v1 | null |
2024-01-26 | Pairing Orthographically Variant Literary Words to Standard Equivalents Using Neural Edit Distance Models | Craig Messner et.al. | 2401.15068v1 | null |
2024-01-26 | Asymmetric Influence of the Amplitude-Dependent Tune Shift on the Transverse Mode-Coupling Instability | Miriam Brosi et.al. | 2401.15065v1 | null |
2024-01-26 | Expert with Clustering: Hierarchical Online Preference Learning Framework | Tianyue Zhou et.al. | 2401.15062v1 | null |
2024-01-25 | Deconstructing Denoising Diffusion Models for Self-Supervised Learning | Xinlei Chen et.al. | 2401.14404v1 | null |
2024-01-25 | O(1) Insertion for Random Walk d-ary Cuckoo Hashing up to the Load Threshold | Tolson Bell et.al. | 2401.14394v1 | null |
2024-01-25 | Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs | Michael R. H. Vorndran et.al. | 2401.14387v1 | link |
2024-01-25 | Manifold GCN: Diffusion-based Convolutional Neural Network for Manifold-valued Graphs | Martin Hanik et.al. | 2401.14381v1 | null |
2024-01-25 | UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models | Timo Kapsalis et.al. | 2401.14379v1 | null |
2024-01-24 | Graph-Informed Neural Networks for Sparse Grid-Based Discontinuity Detectors | Francesco Della Santa et.al. | 2401.13652v1 | link |
2024-01-24 | Employing polyhedral methods to optimize stencils on FPGAs with stencil-specific caches, data reuse, and wide data bursts | Florian Mayer et.al. | 2401.13645v1 | null |
2024-01-24 | Unveiling homophily beyond the pool of opportunities | Sina Sajjadi et.al. | 2401.13642v1 | null |
2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | Taeksoo Kim et.al. | 2401.12979v1 | null |
2024-01-23 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | Hyeonwoo Kim et.al. | 2401.12978v1 | null |
2024-01-23 | In-Context Language Learning: Architectures and Algorithms | Ekin Akyürek et.al. | 2401.12973v1 | link |
2024-01-23 | Raidar: geneRative AI Detection viA Rewriting | Chengzhi Mao et.al. | 2401.12970v1 | link |
2024-01-23 | Minimizing the Age of Two Heterogeneous Sources With Packet Drops Via Cyclic Schedulers | Sahan Liyanaarachchi et.al. | 2401.12962v1 | null |
2024-01-23 | Chatterbox: Robust Transport for LLM Token Streaming under Unstable Network | Hanchen Li et.al. | 2401.12961v1 | null |
2024-01-22 | Exploring Simple Open-Vocabulary Semantic Segmentation | Zihang Lai et.al. | 2401.12217v1 | link |
2024-01-22 | Genericity Through Stratification | Victor Arrial et.al. | 2401.12212v1 | null |
2024-01-22 | OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics | Peiqi Liu et.al. | 2401.12202v1 | link |
2024-01-22 | APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference | Bowen Zhao et.al. | 2401.12200v1 | null |
2024-01-22 | Learning Dynamics from Multicellular Graphs with Deep Neural Networks | Haiqian Yang et.al. | 2401.12196v1 | null |
2024-01-22 | Text Embedding Inversion Attacks on Multilingual Language Models | Yiyi Chen et.al. | 2401.12192v1 | null |
2024-01-19 | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | Lihe Yang et.al. | 2401.10891v1 | link |
2024-01-19 | Event detection from novel data sources: Leveraging satellite imagery alongside GPS traces | Ekin Ugurel et.al. | 2401.10890v1 | link |
2024-01-19 | Synthesizing Moving People with 3D Control | Boyi Li et.al. | 2401.10889v1 | null |
2024-01-19 | Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning | Adib Hasan et.al. | 2401.10862v1 | link |
2024-01-18 | ParaHome: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions | Jeonghwan Kim et.al. | 2401.10232v1 | null |
2024-01-18 | Simultaneous Tactile Estimation and Control for Extrinsic Dexterity | Antonia Bronars et.al. | 2401.10230v1 | null |
2024-01-18 | RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Shilin Xu et.al. | 2401.10228v1 | link |
2024-01-18 | A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Wouter Van Gansbeke et.al. | 2401.10227v1 | link |
2024-01-18 | The Manga Whisperer: Automatically Generating Transcriptions for Comics | Ragav Sachdeva et.al. | 2401.10224v1 | link |
2024-01-18 | Supervised Fine-tuning in turn Improves Visual Foundation Models | Xiaohu Jiang et.al. | 2401.10222v1 | link |
2024-01-18 | AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data | Caroline Choi et.al. | 2401.10220v1 | null |
2024-01-18 | Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions | Namitha Padmanabhan et.al. | 2401.10217v1 | null |
2024-01-18 | GPAvatar: Generalizable and Precise Head Avatar from Image(s) | Xuangeng Chu et.al. | 2401.10215v1 | link |
2024-01-17 | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Lianghui Zhu et.al. | 2401.09417v1 | link |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414v1 | link |
2024-01-17 | Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text | Mazal Bethany et.al. | 2401.09407v1 | null |
2024-01-16 | Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions | Nooshin Pourkamali et.al. | 2401.08429v1 | null |
2024-01-16 | Three ways that non-differentiability affects neural network training | Siddharth Krishna Kumar et.al. | 2401.08426v1 | null |
2024-01-16 | U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts | Silvia Zottin et.al. | 2401.08425v1 | null |
2024-01-16 | Ask the experts: sourcing high-quality datasets for nutritional counselling through Human-AI collaboration | Simone Balloccu et.al. | 2401.08420v1 | link |
2024-01-16 | Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation | Haoran Xu et.al. | 2401.08417v1 | link |
2024-01-12 | Automated Test Case Repair Using Language Models | Ahmadreza Saboor Yaraghi et.al. | 2401.06765v1 | null |
2024-01-12 | APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding | Mingdao Liu et.al. | 2401.06761v1 | null |
2024-01-12 | Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Muhammad Naveed Riaz et.al. | 2401.06757v1 | null |
2024-01-12 | Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection | Muhammad Tayyab Zamir et.al. | 2401.06752v1 | null |
2024-01-12 | The Unreasonable Effectiveness of Easy Training Data for Hard Tasks | Peter Hase et.al. | 2401.06751v1 | link |
2024-01-12 | Measure Theoretic Reeb Graphs and Reeb Spaces | Qingsong Wang et.al. | 2401.06748v1 | null |
2024-01-11 | Distilling Vision-Language Models on Millions of Videos | Yue Zhao et.al. | 2401.06129v1 | null |
2024-01-11 | E $^{2}$ GAN: Efficient Training of Efficient GANs for Image-to-Image Translation | Yifan Gong et.al. | 2401.06127v1 | null |
2024-01-11 | Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors | Jack Saunders et.al. | 2401.06126v1 | null |
2024-01-11 | Manipulating Feature Visualizations with Gradient Slingshots | Dilyara Bareeva et.al. | 2401.06122v1 | link |
2024-01-11 | Gaussian Shadow Casting for Neural Characters | Luis Bolanos et.al. | 2401.06116v1 | null |
2024-01-11 | Jupyter widgets and extensions for education and research in computational physics and chemistry | Dou Du et.al. | 2401.06113v1 | null |
2024-01-10 | InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes | Mohamad Shahbazi et.al. | 2401.05335v1 | null |
2024-01-10 | URHand: Universal Relightable Hands | Zhaoxi Chen et.al. | 2401.05334v1 | null |
2024-01-10 | *SmartMME*: Implementation of Base Station Switching Off Strategy in ns-3 | Argha Sen et.al. | 2401.05329v1 | null |
2024-01-10 | Leveraging Print Debugging to Improve Code Generation in Large Language Models | Xueyu Hu et.al. | 2401.05319v1 | null |
2024-01-10 | Can Probabilistic Feedback Drive User Impacts in Online Platforms? | Jessica Dai et.al. | 2401.05304v1 | null |
2024-01-09 | Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation | Xiyi Chen et.al. | 2401.04728v1 | null |
2024-01-09 | Low-Resource Vision Challenges for Foundation Models | Yunhua Zhang et.al. | 2401.04716v1 | null |
2024-01-09 | Bin Packing under Random-Order: Breaking the Barrier of 3/2 | Anish Hebbar et.al. | 2401.04714v1 | link |
2024-01-09 | RNA-TransCrypt: Image Encryption Using Chaotic RNA Encoding, Novel Transformative Substitution, and Tailored Cryptographic Operations | Muhammad Shahbaz Khan et.al. | 2401.04707v1 | null |
2024-01-08 | AGG: Amortized Generative 3D Gaussians for Single Image to 3D | Dejia Xu et.al. | 2401.04099v1 | null |
2024-01-08 | Modeling AoII in Push- and Pull-Based Sampling of Continuous Time Markov Chains | Ismail Cosandal et.al. | 2401.04098v1 | null |
2024-01-08 | GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation | Tong Wu et.al. | 2401.04092v1 | link |
2024-01-08 | Mixtral of Experts | Albert Q. Jiang et.al. | 2401.04088v1 | null |
2024-01-05 | Denoising Vision Transformers | Jiawei Yang et.al. | 2401.02957v1 | link |
2024-01-05 | Locally Adaptive Neural 3D Morphable Models | Michail Tarasiou et.al. | 2401.02937v1 | link |
2024-01-05 | Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks | Kevin Everson et.al. | 2401.02921v1 | null |
2024-01-04 | Learning to Prompt with Text Only Supervision for Vision-Language Models | Muhammad Uzair Khattak et.al. | 2401.02418v1 | link |
2024-01-04 | LLaMA Pro: Progressive LLaMA with Block Expansion | Chengyue Wu et.al. | 2401.02415v1 | link |
2024-01-04 | LLM Augmented LLMs: Expanding Capabilities through Composition | Rachit Bansal et.al. | 2401.02412v1 | null |
2024-01-04 | What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs | Alex Trevithick et.al. | 2401.02411v1 | null |
2024-01-04 | Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks | Hartwig H. Hochmair et.al. | 2401.02404v1 | null |
2024-01-04 | 3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation | Zihao Xiao et.al. | 2401.02402v1 | null |
2024-01-04 | Learning the 3D Fauna of the Web | Zizhang Li et.al. | 2401.02400v1 | null |
2024-01-03 | From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Evonne Ng et.al. | 2401.01885v1 | link |
2024-01-03 | A rewriting-logic-with-SMT-based formal analysis and parameter synthesis framework for parametric time Petri nets | Jaime Arias et.al. | 2401.01884v1 | null |
2024-01-03 | Theoretical guarantees on the best-of-n alignment policy | Ahmad Beirami et.al. | 2401.01879v1 | null |
2024-01-03 | Graph Neural Networks for Surfactant Multi-Property Prediction | Christoforos Brozos et.al. | 2401.01874v1 | link |
2024-01-03 | Dataset Difficulty and the Role of Inductive Bias | Devin Kwok et.al. | 2401.01867v1 | null |
2024-01-02 | Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models | Zixiang Chen et.al. | 2401.01335v1 | link |
2024-01-02 | An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction | Zaratiana Urchade et.al. | 2401.01326v1 | link |
2024-01-02 | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | S. M Towhidul Islam Tonmoy et.al. | 2401.01313v1 | null |
2024-01-02 | On the uniqueness and computation of commuting extensions | Pascal Koiran et.al. | 2401.01302v1 | null |
2023-12-29 | K-PERM: Personalized Response Generation Using Dynamic Knowledge Retrieval and Persona-Adaptive Queries | Kanak Raj et.al. | 2312.17748v1 | link |
2023-12-28 | Do Androids Know They’re Only Dreaming of Electric Sheep? | Sky CH-Wang et.al. | 2312.17249v1 | null |
2023-12-28 | Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity | Guhao Feng et.al. | 2312.17248v1 | null |
2023-12-28 | The LLM Surgeon | Tycho F. A. van der Ouderaa et.al. | 2312.17244v1 | link |
2023-12-28 | Unsupervised Universal Image Segmentation | Dantong Niu et.al. | 2312.17243v1 | link |
2023-12-28 | Learning to Generate Text in Arbitrary Writing Styles | Aleem Khan et.al. | 2312.17242v1 | null |
2023-12-28 | An Improved Baseline for Reasoning Segmentation with Large Language Model | Senqiao Yang et.al. | 2312.17240v1 | null |
2023-12-28 | Fast Inference of Mixture-of-Experts Language Models with Offloading | Artyom Eliseev et.al. | 2312.17238v1 | link |
2023-12-28 | A Simple LLM Framework for Long-Range Video Question-Answering | Ce Zhang et.al. | 2312.17235v1 | link |
2023-12-28 | Personalized Restoration via Dual-Pivot Tuning | Pradyumna Chari et.al. | 2312.17234v1 | null |
2023-12-26 | Social-Transmotion: Promptable Human Trajectory Prediction | Saeed Saadatnejad et.al. | 2312.16168v1 | link |
2023-12-26 | Age of Information in Gossip Networks: A Friendly Introduction and Literature Survey | Priyanka Kaswan et.al. | 2312.16163v1 | null |
2023-12-26 | Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages | Mofetoluwa Adeyemi et.al. | 2312.16159v1 | null |
2023-12-26 | From Text to Multimodal: A Comprehensive Survey of Adversarial Example Generation in Question Answering Systems | Gulsum Yigit et.al. | 2312.16156v1 | null |
2023-12-26 | Validating Light Phenomena Conceptual Assessment Through The Lens of CTT and IRT Frameworks | Purwoko Haryadi Santoso et.al. | 2312.16153v1 | null |
2023-12-26 | SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network | Yuhang He et.al. | 2312.16149v1 | null |
2023-12-22 | MACS: Mass Conditioned 3D Hand and Object Motion Synthesis | Soshi Shimada et.al. | 2312.14929v1 | null |
2023-12-22 | PoseGen: Learning to Generate 3D Human Pose Dataset with NeRF | Mohsen Gholami et.al. | 2312.14915v1 | link |
2023-12-21 | Virtual Pets: Animatable Animal Generation in 3D Scenes | Yen-Chi Cheng et.al. | 2312.14154v1 | null |
2023-12-21 | DriveLM: Driving with Graph Visual Question Answering | Chonghao Sima et.al. | 2312.14150v1 | link |
2023-12-21 | HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs | Artem Sevastopolsky et.al. | 2312.14140v1 | null |
2023-12-21 | Diffusion Reward: Learning Rewards via Conditional Video Diffusion | Tao Huang et.al. | 2312.14134v1 | null |
2023-12-20 | Generative Multimodal Models are In-Context Learners | Quan Sun et.al. | 2312.13286v1 | link |
2023-12-20 | UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections | Fangjinhua Wang et.al. | 2312.13285v1 | null |
2023-12-20 | Deep Learning on 3D Neural Fields | Pierluigi Zama Ramirez et.al. | 2312.13277v1 | null |
2023-12-20 | Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting | Junwu Zhang et.al. | 2312.13271v1 | link |
2023-12-19 | Weakly Supervised Open-Vocabulary Object Detection | Jianghang Lin et.al. | 2312.12437v1 | null |
2023-12-19 | A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise | Chaoyou Fu et.al. | 2312.12436v1 | link |
2023-12-19 | On Inference Stability for Diffusion Models | Viet Nguyen et.al. | 2312.12431v1 | link |
2023-12-19 | ROSE: A reduced-order scattering emulator for optical models | Daniel Odell et.al. | 2312.12426v1 | null |
2023-12-19 | SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process | Mengyu Wang et.al. | 2312.12425v1 | link |
2023-12-19 | Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model | Shraman Pramanick et.al. | 2312.12423v1 | null |
2023-12-19 | Scene-Conditional 3D Object Stylization and Composition | Jinghao Zhou et.al. | 2312.12419v1 | null |
2023-12-18 | On Computing Makespan-Optimal Solutions for Generalized Sliding-Tile Puzzles | Marcus Gozon et.al. | 2312.10887v1 | null |
2023-12-18 | A novel diffusion recommendation algorithm based on multi-scale cnn and residual lstm | Yong Niu et.al. | 2312.10885v1 | null |
2023-12-18 | Sharable Clothoid-based Continuous Motion Planning for Connected Automated Vehicles | Sanghoon Oh et.al. | 2312.10880v1 | null |
2023-12-18 | Country-Scale Cropland Mapping in Data-Scarce Settings Using Deep Learning: A Case Study of Nigeria | Joaquin Gajardo et.al. | 2312.10872v1 | link |
2023-12-18 | From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape | Timothy R. McIntosh et.al. | 2312.10868v1 | null |
2023-12-15 | Osprey: Pixel Understanding with Visual Instruction Tuning | Yuqian Yuan et.al. | 2312.10032v1 | link |
2023-12-15 | Wearable Coaxially-shielded Metamaterial for Magnetic Resonance Imaging | Xia Zhu et.al. | 2312.10018v1 | null |
2023-12-15 | Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects | Paul Maria Scheikl et.al. | 2312.10008v1 | null |
2023-12-15 | Faithful Persona-based Conversational Dataset Generation with Large Language Models | Pegah Jandaghi et.al. | 2312.10007v1 | link |
2023-12-14 | LIME: Localized Image Editing via Attention Regularization in Diffusion Models | Enis Simsar et.al. | 2312.09256v1 | null |
2023-12-14 | Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization | Luca Bartolomei et.al. | 2312.09254v1 | link |
2023-12-14 | FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection | Hongsuk Choi et.al. | 2312.09252v1 | null |
2023-12-14 | VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation | Jinguo Zhu et.al. | 2312.09251v1 | link |
2023-12-14 | Single Mesh Diffusion Models with Field Latents for Texture Generation | Thomas W. Mitchel et.al. | 2312.09250v1 | null |
2023-12-14 | ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining | Ruoxi Shi et.al. | 2312.09249v1 | null |
2023-12-14 | Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking | Jacob Eisenstein et.al. | 2312.09244v1 | null |
2023-12-14 | OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields | Chubin Zhang et.al. | 2312.09243v1 | link |
2023-12-14 | Text2Immersion: Generative Immersive Scene with 3D Gaussians | Hao Ouyang et.al. | 2312.09242v1 | null |
2023-12-13 | SAM-guided Graph Cut for 3D Instance Segmentation | Haoyu Guo et.al. | 2312.08372v1 | null |
2023-12-13 | PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection | Kuan-Chih Huang et.al. | 2312.08371v1 | link |
2023-12-13 | An Invitation to Deep Reinforcement Learning | Bernhard Jaeger et.al. | 2312.08365v1 | null |
2023-12-13 | View-Dependent Octree-based Mesh Extraction in Unbounded Scenes for Procedural Synthetic Data | Zeyu Ma et.al. | 2312.08364v1 | link |
2023-12-13 | On the Computational Hardness of Quantum One-Wayness | Bruno Cavalar et.al. | 2312.08363v1 | null |
2023-12-13 | Distributed Inference and Fine-tuning of Large Language Models Over The Internet | Alexander Borzunov et.al. | 2312.08361v1 | null |
2023-12-12 | diff History for Long-Context Language Agents | Ulyana Piterbarg et.al. | 2312.07540v1 | link |
2023-12-12 | HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation | Hongyu Liu et.al. | 2312.07539v1 | null |
2023-12-12 | FreeInit: Bridging Initialization Gap in Video Diffusion Models | Tianxing Wu et.al. | 2312.07537v1 | link |
2023-12-12 | FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition | Sicheng Mo et.al. | 2312.07536v1 | null |
2023-12-12 | Interfacing Foundation Models’ Embeddings | Xueyan Zou et.al. | 2312.07532v1 | link |
2023-12-12 | Topological Obstructions and How to Avoid Them | Babak Esmaeili et.al. | 2312.07529v1 | null |
2023-12-11 | CAD: Photorealistic 3D Generation via Adversarial Distillation | Ziyu Wan et.al. | 2312.06663v1 | null |
2023-12-11 | Photorealistic Video Generation with Diffusion Models | Agrim Gupta et.al. | 2312.06662v1 | null |
2023-12-11 | UpFusion: Novel View Diffusion from Unposed Sparse View Observations | Bharath Raj Nagoor Kani et.al. | 2312.06661v1 | null |
2023-12-11 | EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM | Chong Zhou et.al. | 2312.06660v1 | link |
2023-12-11 | Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior | Fangfu Liu et.al. | 2312.06655v1 | link |
2023-12-11 | LightSim: Neural Lighting Simulation for Urban Scenes | Ava Pun et.al. | 2312.06654v1 | null |
2023-12-11 | Adaptive Human Trajectory Prediction via Latent Corridors | Neerja Thakkar et.al. | 2312.06653v1 | null |
2023-12-11 | Nuvo: Neural UV Mapping for Unruly 3D Representations | Pratul P. Srinivasan et.al. | 2312.05283v1 | null |
2023-12-08 | KBFormer: A Diffusion Model for Structured Entity Completion | Ouail Kitouni et.al. | 2312.05253v1 | null |
2023-12-08 | Laboratory realization of relativistic pair-plasma beams | C. D. Arrowsmith et.al. | 2312.05244v1 | null |
2023-12-08 | Contra generative AI detection in higher education assessments | Cesare G. Ardito et.al. | 2312.05241v1 | null |
2023-12-08 | SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation | Thuan Hoang Nguyen et.al. | 2312.05239v1 | null |
2023-12-08 | Seeing ChatGPT Through Universities’ Policies, Resources and Guidelines | Hui Wang et.al. | 2312.05235v1 | null |
2023-12-07 | Scaling Laws of Synthetic Images for Model Training … for Now | Lijie Fan et.al. | 2312.04567v1 | link |
2023-12-07 | Gen2Det: Generate to Detect | Saksham Suri et.al. | 2312.04566v1 | null |
2023-12-07 | MuRF: Multi-Baseline Radiance Fields | Haofei Xu et.al. | 2312.04565v1 | link |
2023-12-07 | GenDeF: Learning Generative Deformation Field for Video Generation | Wen Wang et.al. | 2312.04561v1 | null |
2023-12-07 | NeRFiller: Completing Scenes via Generative 3D Inpainting | Ethan Weber et.al. | 2312.04560v1 | null |
2023-12-07 | PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation | Zhaoxi Chen et.al. | 2312.04559v1 | link |
2023-12-07 | GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | Shoufa Chen et.al. | 2312.04557v1 | null |
2023-12-07 | Large Language Models for Mathematicians | Simon Frieder et.al. | 2312.04556v1 | null |
2023-12-07 | Improved Visual Grounding through Self-Consistent Explanations | Ruozhen He et.al. | 2312.04554v1 | null |
2023-12-07 | Generating Illustrated Instructions | Sachit Menon et.al. | 2312.04552v1 | null |
2023-12-06 | Relightable Gaussian Codec Avatars | Shunsuke Saito et.al. | 2312.03704v1 | null |
2023-12-06 | Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning | Xinshun Wang et.al. | 2312.03703v1 | link |
2023-12-06 | Self-conditioned Image Generation via Generating Representations | Tianhong Li et.al. | 2312.03701v1 | link |
2023-12-06 | Intrinsic Harmonization for Illumination-Aware Compositing | Chris Careaga et.al. | 2312.03698v1 | link |
2023-12-06 | Efficient Learning in Polyhedral Games via Best Response Oracles | Darshan Chakrabarti et.al. | 2312.03696v1 | null |
2023-12-06 | Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication | Ali Naseh et.al. | 2312.03692v1 | null |
2023-12-06 | On the Role of Edge Dependency in Graph Generative Models | Sudhanshu Chanpuriya et.al. | 2312.03691v1 | null |
2023-12-06 | Evaluating and Mitigating Discrimination in Language Model Decisions | Alex Tamkin et.al. | 2312.03689v1 | null |
2023-12-05 | GPT4Point: A Unified Framework for Point-Language Understanding and Generation | Zhangyang Qi et.al. | 2312.02980v1 | null |
2023-12-05 | Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World | Kiana Ehsani et.al. | 2312.02976v1 | null |
2023-12-05 | Describing Differences in Image Sets with Natural Language | Lisa Dunlap et.al. | 2312.02974v1 | link |
2023-12-05 | Alchemist: Parametric Control of Material Properties with Diffusion Models | Prafull Sharma et.al. | 2312.02970v1 | null |
2023-12-05 | Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models | Xinyu Zhang et.al. | 2312.02969v1 | null |
2023-12-05 | AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model | Boheng Zhao et.al. | 2312.02967v1 | null |
2023-12-05 | Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Cheng-Ju Ho et.al. | 2312.02966v1 | link |
2023-12-05 | MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures | Zhangyang Xiong et.al. | 2312.02963v1 | null |
2023-12-04 | Aligning and Prompting Everything All at Once for Universal Visual Perception | Yunhang Shen et.al. | 2312.02153v1 | link |
2023-12-04 | Readout Guidance: Learning Control from Diffusion Features | Grace Luo et.al. | 2312.02150v1 | null |
2023-12-04 | Generative Powers of Ten | Xiaojuan Wang et.al. | 2312.02149v1 | null |
2023-12-04 | Rejuvenating image-GPT as Strong Visual Representation Learners | Sucheng Ren et.al. | 2312.02147v1 | link |
2023-12-04 | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Bingxin Ke et.al. | 2312.02145v1 | link |
2023-12-04 | Optimizing Camera Configurations for Multi-View Pedestrian Detection | Yunzhong Hou et.al. | 2312.02144v1 | null |
2023-12-04 | Competition-Level Problems Are Effective Evaluators of LLMs | Yiming Huang et.al. | 2312.02143v1 | null |
2023-12-04 | Object Recognition as Next Token Prediction | Kaiyu Yue et.al. | 2312.02142v1 | link |
2023-12-01 | VideoBooth: Diffusion-based Video Generation with Image Prompts | Yuming Jiang et.al. | 2312.00777v1 | null |
2023-12-01 | Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans | Homanga Bharadhwaj et.al. | 2312.00775v1 | null |
2023-12-01 | Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses | Xiao Ma et.al. | 2312.00763v1 | null |
2023-12-01 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Albert Gu et.al. | 2312.00752v1 | link |
2023-12-01 | Reduction from sparse LPN to LPN, Dual Attack 3.0 | Kévin Carrier et.al. | 2312.00747v1 | null |
2023-12-01 | Adversarial Score Distillation: When score distillation meets GAN | Min Wei et.al. | 2312.00739v1 | link |
2023-11-30 | Dataset Distillation in Large Data Era | Zeyuan Yin et.al. | 2311.18838v1 | link |
2023-11-30 | VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Zhen Xing et.al. | 2311.18837v1 | null |
2023-11-30 | PoseGPT: Chatting about 3D Human Pose | Yao Feng et.al. | 2311.18836v1 | null |
2023-11-30 | InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation | Rongyao Fang et.al. | 2311.18835v1 | link |
2023-11-30 | ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models | Wenming Weng et.al. | 2311.18834v1 | null |
2023-11-30 | Exploiting Diffusion Prior for Generalizable Pixel-Level Semantic Prediction | Hsin-Ying Lee et.al. | 2311.18832v1 | link |
2023-11-30 | MotionEditor: Editing Video Motion via Content-Aware Diffusion | Shuyuan Tu et.al. | 2311.18830v1 | link |
2023-11-30 | MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Yanhui Wang et.al. | 2311.18829v1 | null |
2023-11-30 | One-step Diffusion with Distribution Matching Distillation | Tianwei Yin et.al. | 2311.18828v1 | null |
2023-11-30 | An Adaptive Framework for Generalizing Network Traffic Prediction towards Uncertain Environments | Alexander Downey et.al. | 2311.18824v1 | null |
2023-11-29 | A Simple Recipe for Language-guided Domain Generalized Segmentation | Mohammad Fahes et.al. | 2311.17922v1 | null |
2023-11-29 | Do text-free diffusion models learn discriminative visual representations? | Soumik Mukhopadhyay et.al. | 2311.17921v1 | link |
2023-11-29 | Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models | Daniel Geng et.al. | 2311.17919v1 | null |
2023-11-29 | Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving | Yuqi Wang et.al. | 2311.17918v1 | link |
2023-11-29 | AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text | Jianfeng Zhang et.al. | 2311.17917v1 | null |
2023-11-29 | OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation | Qidong Huang et.al. | 2311.17911v1 | link |
2023-11-29 | HUGS: Human Gaussian Splats | Muhammed Kocabas et.al. | 2311.17910v1 | null |
2023-11-29 | CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting | Alexander Vilesov et.al. | 2311.17907v1 | null |
2023-11-28 | HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting | Xian Liu et.al. | 2311.17061v1 | null |
2023-11-28 | Material Palette: Extraction of Materials from a Single Image | Ivan Lopes et.al. | 2311.17060v1 | null |
2023-11-28 | Panoptic Video Scene Graph Generation | Jingkang Yang et.al. | 2311.17058v1 | link |
2023-11-28 | ReMoS: Reactive 3D Motion Synthesis for Two-Person Interactions | Anindita Ghosh et.al. | 2311.17057v1 | null |
2023-11-28 | Self-Supervised Motion Magnification by Backpropagating Through Optical Flow | Zhaoying Pan et.al. | 2311.17056v1 | null |
2023-11-28 | No Representation Rules Them All in Category Discovery | Sagar Vaze et.al. | 2311.17055v1 | null |
2023-11-28 | DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models | Tsun-Hsuan Wang et.al. | 2311.17053v1 | null |
2023-11-28 | Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models | Zhengming Yu et.al. | 2311.17050v1 | null |
2023-11-27 | Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models | Munan Ning et.al. | 2311.16103v1 | link |
2023-11-27 | Test-time Adaptation of Discriminative Models via Diffusion Generative Feedback | Mihir Prabhudesai et.al. | 2311.16102v1 | null |
2023-11-27 | How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs | Haoqin Tu et.al. | 2311.16101v1 | link |
2023-11-27 | GART: Gaussian Articulated Template Models | Jiahui Lei et.al. | 2311.16099v1 | null |
2023-11-27 | On Bringing Robots Home | Nur Muhammad Mahi Shafiullah et.al. | 2311.16098v1 | link |
2023-11-27 | CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | Christian Diller et.al. | 2311.16097v1 | null |
2023-11-27 | Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling | Zhe Li et.al. | 2311.16096v1 | link |
2023-11-27 | Self-correcting LLM-controlled Diffusion Models | Tsung-Han Wu et.al. | 2311.16090v1 | null |
2023-11-27 | DUnE: Dataset for Unified Editing | Afra Feyza Akyürek et.al. | 2311.16087v1 | link |
2023-11-24 | SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation | Lingchen Meng et.al. | 2311.14671v1 | link |
2023-11-24 | Data-driven Prior Learning for Bayesian Optimisation | Sigrid Passano Hellan et.al. | 2311.14653v1 | link |
2023-11-24 | One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space | Raghav Addanki et.al. | 2311.14652v1 | null |
2023-11-24 | History Filtering in Imperfect Information Games: Algorithms and Complexity | Christopher Solinas et.al. | 2311.14651v1 | null |
2023-11-22 | Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation | Daichi Horita et.al. | 2311.13602v1 | null |
2023-11-22 | Visual In-Context Prompting | Feng Li et.al. | 2311.13601v1 | link |
2023-11-22 | ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs | Viraj Shah et.al. | 2311.13600v1 | null |
2023-11-22 | Risk-sensitive Markov Decision Process and Learning under General Utility Functions | Zhengqi Wu et.al. | 2311.13589v1 | null |
2023-11-22 | A Survey of Serverless Machine Learning Model Inference | Kamil Kojs et.al. | 2311.13587v1 | null |
2023-11-22 | On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates | Stefano Bruno et.al. | 2311.13584v1 | null |
2023-11-22 | PaSS: Parallel Speculative Sampling | Giovanni Monea et.al. | 2311.13581v1 | null |
2023-11-22 | Aufbau Suppressed Coupled Cluster Theory for Electronically Excited States | Harrison Tuckman et.al. | 2311.13576v1 | null |
2023-11-21 | Intrinsic Image Decomposition via Ordinal Shading | Chris Careaga et.al. | 2311.12792v1 | link |
2023-11-21 | Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks | Samyak Jain et.al. | 2311.12786v1 | null |
2023-11-20 | Rate-Independent Gradient Crystal Plasticity Theory – Robust Algorithmic Formulations based on Incremental Energy Minimization | Volker Fohrmeister et.al. | 2311.12026v1 | null |
2023-11-20 | The allosteric lever: towards a principle of specific allosteric response | Maximilian Vossel et.al. | 2311.12025v1 | null |
2023-11-20 | PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction | Peng Wang et.al. | 2311.12024v1 | null |
2023-11-20 | Macroscopic description of a heavy particle immersed within a flow of light particles | Radek Erban et.al. | 2311.12021v1 | null |
2023-11-20 | An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software | Aaditya Bhatia et.al. | 2311.12019v1 | null |
2023-11-20 | GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration | Naoki Wake et.al. | 2311.12015v1 | null |
2023-11-17 | Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning | Rohit Girdhar et.al. | 2311.10709v1 | null |
2023-11-17 | SelfEval: Leveraging the discriminative nature of generative models for evaluation | Sai Saketh Rambhatla et.al. | 2311.10708v1 | null |
2023-11-17 | Cactus Representations in Polylogarithmic Max-flow via Maximal Isolating Mincuts | Zhongtian He et.al. | 2311.10706v1 | null |
2023-11-16 | The Chosen One: Consistent Characters in Text-to-Image Diffusion Models | Omri Avrahami et.al. | 2311.10093v1 | null |
2023-11-16 | Traffic Video Object Detection using Motion Prior | Lihao Liu et.al. | 2311.10092v1 | null |
2023-11-16 | Adaptive Shells for Efficient Neural Radiance Field Rendering | Zian Wang et.al. | 2311.10091v1 | null |
2023-11-16 | Emu Edit: Precise Image Editing via Recognition and Generation Tasks | Shelly Sheynin et.al. | 2311.10089v1 | null |
2023-11-16 | DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback | Yangyi Chen et.al. | 2311.10081v1 | null |
2023-11-16 | Improving 3D Synthetic Jet Modeling in a Crossflow | Howard Ho et.al. | 2311.10072v1 | null |
2023-11-15 | Single-Image 3D Human Digitization with Shape-Guided Diffusion | Badour AlBahar et.al. | 2311.09221v1 | null |
2023-11-15 | DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model | Yinghao Xu et.al. | 2311.09217v1 | null |
2023-11-15 | Assessing Translation capabilities of Large Language Models involving English and Indian Languages | Vandan Mujadia et.al. | 2311.09216v1 | null |
2023-11-15 | GRIM: GRaph-based Interactive narrative visualization for gaMes | Jorge Leandro et.al. | 2311.09213v1 | null |
2023-11-15 | Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects – A Survey | Ashok Urlana et.al. | 2311.09212v1 | link |
2023-11-15 | Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models | Wenhao Yu et.al. | 2311.09210v1 | null |
2023-11-15 | A Unified Approach to Learning Ising Models: Beyond Independence and Bounded Width | Jason Gaitonde et.al. | 2311.09197v1 | null |
2023-11-15 | Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning without Task-Specific Knowledge | Sang-Hyun Lee et.al. | 2311.09195v1 | null |
2023-11-15 | Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models | James A. Michaelov et.al. | 2311.09194v1 | null |
2023-11-14 | Instant3D: Instant Text-to-3D Generation | Ming Li et.al. | 2311.08403v1 | null |
2023-11-14 | Fine-tuning Language Models for Factuality | Katherine Tian et.al. | 2311.08401v1 | null |
2023-11-14 | Towards Open-Ended Visual Recognition with Large Language Model | Qihang Yu et.al. | 2311.08400v1 | link |
2023-11-14 | Are Large Language Models Temporally Grounded? | Yifu Qiu et.al. | 2311.08398v1 | link |
2023-11-14 | MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation | Ehsan Asali et.al. | 2311.08393v1 | null |
2023-11-14 | On What Basis? Predicting Text Preference Via Structured Comparative Reasoning | Jing Nathan Yan et.al. | 2311.08390v1 | null |
2023-11-14 | TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer | Huashan Sun et.al. | 2311.08389v1 | null |
2023-11-13 | To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning | Junke Wang et.al. | 2311.07574v1 | link |
2023-11-13 | Realizability of Free Spaces of Curves | Hugo A. Akitaya et.al. | 2311.07573v1 | null |
2023-11-13 | Feature emergence via margin maximization: case studies in algebraic tasks | Depen Morwani et.al. | 2311.07568v1 | null |
2023-11-13 | GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation | An Yan et.al. | 2311.07562v1 | link |
2023-11-13 | Fast Normalized Cross-Correlation for Template Matching with Rotations | José María Almira et.al. | 2311.07561v1 | null |
2023-11-13 | Sound Gradual Verification with Symbolic Execution | Conrad Zimmerman et.al. | 2311.07559v1 | null |
2023-11-13 | Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning | Arjun Bhardwaj et.al. | 2311.07558v1 | null |
2023-11-10 | Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization | Weiyang Liu et.al. | 2311.06243v1 | null |
2023-11-10 | Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks | Bin Xiao et.al. | 2311.06242v1 | null |
2023-11-10 | Nonnegativity Problems for Matrix Semigroups | Julian D’Costa et.al. | 2311.06241v1 | null |
2023-11-10 | Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild | Nanna Inie et.al. | 2311.06237v1 | null |
2023-11-10 | Deep Learning meets Blockchain for Automated and Secure Access Control | Asma Jodeiri Akbarfam et.al. | 2311.06236v1 | null |
2023-11-10 | Learning material synthesis-structure-property relationship by data fusion: Bayesian Co-regionalization N-Dimensional Piecewise Function Learning | A. Gilad Kusne et.al. | 2311.06228v1 | null |
2023-11-10 | Does Differential Privacy Prevent Backdoor Attacks in Practice? | Fereshteh Razmi et.al. | 2311.06227v1 | null |
2023-11-09 | What Do I Hear? Generating Sounds for Visuals with ChatGPT | David Chuan-En Lin et.al. | 2311.05609v1 | null |
2023-11-09 | Real-Time Neural Rasterization for Large Scenes | Jeffrey Yunfan Liu et.al. | 2311.05607v1 | null |
2023-11-09 | Diffusion-Generative Multi-Fidelity Learning for Physical Simulation | Zheng Wang et.al. | 2311.05606v1 | null |
2023-11-09 | 3D-QAE: Fully Quantum Auto-Encoding of 3D Point Clouds | Lakshika Rathi et.al. | 2311.05604v1 | null |
2023-11-09 | Reconstructing Objects in-the-wild for Realistic Sensor Simulation | Ze Yang et.al. | 2311.05602v1 | null |
2023-11-09 | SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers | Sammy Christen et.al. | 2311.05599v1 | null |
2023-11-09 | LLM Augmented Hierarchical Agents | Bharat Prakash et.al. | 2311.05596v1 | null |
2023-11-08 | GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs | Zhenfang Chen et.al. | 2311.04901v1 | null |
2023-11-08 | How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure | Michael Wilson et.al. | 2311.04900v1 | link |
2023-11-08 | Optimized measurements of chaotic dynamical systems via the information bottleneck | Kieran A. Murphy et.al. | 2311.04896v1 | null |
2023-11-08 | The Monadic Theory of Toric Words | Valérie Berthé et.al. | 2311.04895v1 | null |
2023-11-08 | Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs | Shashank Gupta et.al. | 2311.04892v1 | link |
2023-11-08 | AutoChip: Automating HDL Generation Using LLM Feedback | Shailja Thakur et.al. | 2311.04887v1 | link |
2023-11-08 | SEMQA: Semi-Extractive Multi-Source Question Answering | Tal Schuster et.al. | 2311.04886v1 | link |
2023-11-07 | Towards Garment Sewing Pattern Reconstruction from a Single Image | Lijuan Liu et.al. | 2311.04218v1 | link |
2023-11-07 | Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves | Yihe Deng et.al. | 2311.04205v1 | link |
2023-11-07 | Sharp Thresholds Imply Circuit Lower Bounds: from random 2-SAT to Planted Clique | David Gamarnik et.al. | 2311.04204v1 | null |
2023-11-07 | Exploring Recommendation Capabilities of GPT-4V(ision): A Preliminary Case Study | Peilin Zhou et.al. | 2311.04199v1 | null |
2023-11-07 | JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction | Zhongfen Deng et.al. | 2311.04196v1 | link |
2023-11-06 | GLaMM: Pixel Grounding Large Multimodal Model | Hanoona Rasheed et.al. | 2311.03356v1 | null |
2023-11-06 | SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis | Hanrong Ye et.al. | 2311.03355v1 | null |
2023-11-06 | CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding | Junyan Li et.al. | 2311.03354v1 | null |
2023-11-06 | Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation | Rusheb Shah et.al. | 2311.03348v1 | null |
2023-11-06 | Decomposing Probability Marginals Beyond Affine Requirements | Jannik Matuschke et.al. | 2311.03346v1 | null |
2023-11-06 | Long-Term Invariant Local Features via Implicit Cross-Domain Correspondences | Zador Pataki et.al. | 2311.03345v1 | null |
2023-11-06 | Embedding First Order Logic into Kernel Machines | Michelangelo Diligenti et.al. | 2311.03340v1 | null |
2023-11-03 | EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision | Jiawei Yang et.al. | 2311.02077v1 | null |
2023-11-03 | Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos | Dayal Singh Kalra et.al. | 2311.02076v1 | null |
2023-11-03 | Envy-Free Cake-Cutting for Four Agents | Alexandros Hollender et.al. | 2311.02075v1 | null |
2023-11-03 | Learning Historical Status Prompt for Accurate and Robust Visual Tracking | Wenrui Cai et.al. | 2311.02072v1 | null |
2023-11-03 | Grounded Intuition of GPT-Vision’s Abilities with Scientific Images | Alyssa Hwang et.al. | 2311.02069v1 | link |
2023-11-03 | GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations | Yuxiao Zhou et.al. | 2311.02062v1 | null |
2023-11-03 | Active Learning-Based Species Range Estimation | Christian Lange et.al. | 2311.02061v1 | link |
2023-11-02 | Idempotent Generative Network | Assaf Shocher et.al. | 2311.01462v1 | null |
2023-11-02 | Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization | Jameel Hassan et.al. | 2311.01459v1 | null |
2023-11-02 | Detecting Deepfakes Without Seeing Any | Tal Reiss et.al. | 2311.01458v1 | link |
2023-11-02 | RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation | Yufei Wang et.al. | 2311.01455v1 | null |
2023-11-02 | NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities | Ruohan Zhang et.al. | 2311.01454v1 | null |
2023-11-02 | DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing | Vint Lee et.al. | 2311.01450v1 | null |
2023-11-02 | UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation | Yuwen Xiong et.al. | 2311.01448v1 | null |
2023-11-02 | CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation | Jingkang Wang et.al. | 2311.01447v1 | null |
2023-11-02 | Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation | Jay Sarva et.al. | 2311.01446v1 | null |
2023-11-01 | End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Juan Zuluaga-Gomez et.al. | 2311.00697v1 | link |
2023-11-01 | Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Zhan Ling et.al. | 2311.00694v1 | link |
2023-11-01 | Improving Interpersonal Communication by Simulating Audiences with Language Models | Ryan Liu et.al. | 2311.00687v1 | link |
2023-11-01 | Deep Learning-Based Classification of Gamma Photon Interactions in Room-Temperature Semiconductor Radiation Detectors | Sandeep K. Chaudhuri et.al. | 2311.00682v1 | null |
2023-11-01 | Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs | Xue-Yong Fu et.al. | 2311.00681v1 | null |
2023-10-31 | Unexpected Improvements to Expected Improvement for Bayesian Optimization | Sebastian Ament et.al. | 2310.20708v1 | null |
2023-10-31 | What’s In My Big Data? | Yanai Elazar et.al. | 2310.20707v1 | link |
2023-10-31 | DDAM-PS: Diligent Domain Adaptive Mixer for Person Search | Mohammed Khaleed Almansoori et.al. | 2310.20706v1 | link |
2023-10-31 | SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction | Xinyuan Chen et.al. | 2310.20700v1 | null |
2023-11-01 | Bayesian Multistate Bennett Acceptance Ratio Methods | Xinqiang Ding et.al. | 2310.20699v2 | link |
2023-10-31 | Learning From Mistakes Makes LLM Better Reasoner | Shengnan An et.al. | 2310.20689v1 | link |
2023-10-31 | Compression with Exact Error Distribution for Federated Learning | Mahmoud Hegazy et.al. | 2310.20682v1 | null |
2023-10-30 | Variational principles for the hydrodynamics of the classical one-component plasma | Daniels Krimans et.al. | 2310.19239v1 | null |
2023-10-30 | Building Real-World Meeting Summarization Systems using Large Language Models: A Practical Perspective | Md Tahmid Rahman Laskar et.al. | 2310.19233v1 | null |
2023-10-30 | Stochastic Configuration Machines: FPGA Implementation | Matthew J. Felicetti et.al. | 2310.19225v1 | null |
2023-10-30 | CHAMMI: A benchmark for channel-adaptive models in microscopy imaging | Zitong Chen et.al. | 2310.19224v1 | link |
2023-10-27 | FP8-LM: Training FP8 Large Language Models | Houwen Peng et.al. | 2310.18313v1 | link |
2023-10-27 | Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models | Pushkal Katara et.al. | 2310.18308v1 | null |
2023-10-27 | Interactive Motion Planning for Autonomous Vehicles with Joint Optimization | Yuxiao Chen et.al. | 2310.18301v1 | null |
2023-10-27 | Enhancing the Performance of a Biomimetic Robotic Elbow-and-Forearm System Through Bionics-Inspired Optimization | Haosen Yang et.al. | 2310.18299v1 | null |
2023-10-27 | Sharp-Edge Diffraction of Laguerre-Gauss Vortex Beams by Elliptic Apertures | Riccardo Borghi et.al. | 2310.18298v1 | null |
2023-10-27 | Addressing GAN Training Instabilities via Tunable Classification Losses | Monica Welfert et.al. | 2310.18291v1 | null |
2023-10-26 | Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model | Karsten Roth et.al. | 2310.17653v1 | link |
2023-10-26 | A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection | Anas Al-lahham et.al. | 2310.17650v1 | link |
2023-10-26 | 6-DoF Stability Field via Diffusion Models | Takuma Yoneda et.al. | 2310.17649v1 | null |
2023-10-26 | In-Context Learning Dynamics with Random Binary Sequences | Eric J. Bigelow et.al. | 2310.17639v1 | null |
2023-10-26 | Generative Fractional Diffusion Models | Gabriel Nobis et.al. | 2310.17638v1 | null |
2023-10-26 | JudgeLM: Fine-tuned Large Language Models are Scalable Judges | Lianghui Zhu et.al. | 2310.17631v1 | link |
2023-10-25 | SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation | Qianxu Wang et.al. | 2310.16838v1 | null |
2023-10-25 | Proposal-Contrastive Pretraining for Object Detection from Fewer Data | Quentin Bouniot et.al. | 2310.16835v1 | null |
2023-10-25 | CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images | Aaron Gokaslan et.al. | 2310.16825v1 | link |
2023-10-26 | DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior | Jingxiang Sun et.al. | 2310.16818v2 | link |
2023-10-25 | The intelligent agent model – a fully two-dimensional microscopic traffic flow model | Martin Treiber et.al. | 2310.16816v1 | null |
2023-10-24 | Synthetic Data as Validation | Qixin Hu et.al. | 2310.16052v1 | null |
2023-10-24 | EquivAct: SIM(3)-Equivariant Visuomotor Policies beyond Rigid Object Manipulation | Jingyun Yang et.al. | 2310.16050v1 | null |
2023-10-24 | MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning | Zayne Sprague et.al. | 2310.16049v1 | link |
2023-10-24 | From Posterior Sampling to Meaningful Diversity in Image Restoration | Noa Cohen et.al. | 2310.16047v1 | null |
2023-10-24 | Woodpecker: Hallucination Correction for Multimodal Large Language Models | Shukang Yin et.al. | 2310.16045v1 | link |
2023-10-25 | Stanford-ORB: A Real-World 3D Object Inverse Rendering Benchmark | Zhengfei Kuang et.al. | 2310.16044v2 | link |
2023-10-25 | WebWISE: Web Interface Control and Sequential Exploration with Large Language Models | Heyi Tao et.al. | 2310.16042v2 | null |
2023-10-24 | Instruct and Extract: Instruction Tuning for On-Demand Information Extraction | Yizhu Jiao et.al. | 2310.16040v1 | link |
2023-10-23 | FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling | Haonan Qiu et.al. | 2310.15169v1 | link |
2023-10-24 | Ghost on the Shell: An Expressive Representation of General 3D Shapes | Zhen Liu et.al. | 2310.15168v2 | null |
2023-10-23 | SAM-Med3D | Haoyu Wang et.al. | 2310.15161v1 | link |
2023-10-23 | FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models | Lihe Yang et.al. | 2310.15160v1 | link |
2023-10-23 | Online Detection of AI-Generated Images | David C. Epstein et.al. | 2310.15150v1 | null |
2023-10-23 | DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design | Kevin Lin et.al. | 2310.15144v1 | link |
2023-10-23 | SpecTr: Fast Speculative Decoding via Optimal Transport | Ziteng Sun et.al. | 2310.15141v1 | null |
2023-10-20 | Neural-Base Music Generation for Intelligence Duplication | Jacob Galajda et.al. | 2310.13691v1 | null |
2023-10-20 | Exploring Linguistic Probes for Morphological Generalization | Jordan Kodner et.al. | 2310.13686v1 | null |
2023-10-20 | CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages | Gabriel Oliveira dos Santos et.al. | 2310.13683v1 | link |
2023-10-20 | Optimizing Retrieval-augmented Reader Models via Token Elimination | Moshe Berchansky et.al. | 2310.13682v1 | link |
2023-10-20 | Information Value: Measuring Utterance Predictability as Distance from Plausible Alternatives | Mario Giulianelli et.al. | 2310.13676v1 | link |
2023-10-20 | On Synthetic Data for Back Translation | Jiahao Xu et.al. | 2310.13675v1 | link |
2023-10-19 | HumanTOMATO: Text-aligned Whole-body Motion Generation | Shunlin Lu et.al. | 2310.12978v1 | null |
2023-10-19 | Training Dynamics of Deep Network Linear Regions | Ahmed Imtiaz Humayun et.al. | 2310.12977v1 | null |
2023-10-19 | Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Ziqi Pang et.al. | 2310.12973v1 | link |
2023-10-19 | CCIL: Continuity-based Data Augmentation for Corrective Imitation Learning | Liyiming Ke et.al. | 2310.12972v1 | null |
2023-10-19 | CLAIR: Evaluating Image Captions with Large Language Models | David Chan et.al. | 2310.12971v1 | null |
2023-10-19 | Does Your Model Think Like an Engineer? Explainable AI for Bearing Fault Detection with Deep Learning | Thomas Decker et.al. | 2310.12967v1 | null |
2023-10-18 | Understanding Retrieval Augmentation for Long-Form Question Answering | Hung-Ting Chen et.al. | 2310.12150v1 | null |
2023-10-18 | Object-aware Inversion and Reassembly for Image Editing | Zhen Yang et.al. | 2310.12149v1 | null |
2023-10-18 | Simple Mechanisms for Representing, Indexing and Manipulating Concepts | Yuanzhi Li et.al. | 2310.12143v1 | null |
2023-10-17 | DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis | Youngjoong Kwon et.al. | 2310.11449v1 | null |
2023-10-17 | Functional Invariants to Watermark Large Transformers | Pierre Fernandez et.al. | 2310.11446v1 | null |
2023-10-18 | EvalCrafter: Benchmarking and Evaluating Large Video Generation Models | Yaofang Liu et.al. | 2310.11440v2 | link |
2023-10-17 | Sadness, Anger, or Anxiety: Twitter Users’ Emotional Responses to Toxicity in Public Conversations | Ana Aleksandric et.al. | 2310.11436v1 | null |
2023-10-17 | An Empirical Study of Translation Hypothesis Ensembling with Large Language Models | António Farinhas et.al. | 2310.11430v1 | link |
2023-10-17 | Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression | Adam Block et.al. | 2310.11428v1 | null |
2023-10-17 | A Computational Framework for Solving Wasserstein Lagrangian Flows | Kirill Neklyudov et.al. | 2310.10649v2 | link |
2023-10-16 | Step-by-Step Remediation of Students’ Mathematical Mistakes | Rose E. Wang et.al. | 2310.10648v1 | link |
2023-10-16 | A Survey on Video Diffusion Models | Zhen Xing et.al. | 2310.10647v1 | link |
2023-10-16 | Interactive Task Planning with Language Models | Boyi Li et.al. | 2310.10645v1 | null |
2023-10-16 | TOSS: High-quality Text-guided Novel View Synthesis from a Single Image | Yukai Shi et.al. | 2310.10644v1 | null |
2023-10-16 | Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting | Zeyu Yang et.al. | 2310.10642v1 | link |
2023-10-16 | LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts | Hanan Gani et.al. | 2310.10640v1 | link |
2023-10-16 | Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models | Kevin Black et.al. | 2310.10639v1 | link |
2023-10-13 | Vision-by-Language for Training-Free Compositional Image Retrieval | Shyamgopal Karthik et.al. | 2310.09291v1 | link |
2023-10-13 | Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning | Geri Skenderi et.al. | 2310.09278v1 | null |
2023-10-13 | Retro-fallback: retrosynthetic planning in an uncertain world | Austin Tripp et.al. | 2310.09270v1 | null |
2023-10-13 | Genetic algorithms are strong baselines for molecule generation | Austin Tripp et.al. | 2310.09267v1 | null |
2023-10-13 | Towards End-to-end 4-Bit Inference on Generative Large Language Models | Saleh Ashkboos et.al. | 2310.09259v1 | link |
2023-10-12 | Octopus: Embodied Vision-Language Programmer from Environmental Feedback | Jingkang Yang et.al. | 2310.08588v1 | link |
2023-10-12 | Is Generalized Dynamic Novel View Synthesis from Monocular Videos Possible Today? | Xiaoming Zhao et.al. | 2310.08587v1 | null |
2023-10-12 | PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm | Haoyi Zhu et.al. | 2310.08586v1 | link |
2023-10-12 | Discovering Fatigued Movements for Virtual Character Animation | Noshaba Cheema et.al. | 2310.08583v1 | null |
2023-10-12 | Tree-Planner: Efficient Close-loop Task Planning with Large Language Models | Mengkang Hu et.al. | 2310.08582v1 | null |
2023-10-12 | Universal Visual Decomposer: Long-Horizon Manipulation Made Easy | Zichen Zhang et.al. | 2310.08581v1 | null |
2023-10-12 | OmniControl: Control Any Joint at Any Time for Human Motion Generation | Yiming Xie et.al. | 2310.08580v1 | link |
2023-10-12 | HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion | Xian Liu et.al. | 2310.08579v1 | null |
2023-10-12 | Learning to Act from Actionless Videos through Dense Correspondences | Po-Chen Ko et.al. | 2310.08576v1 | null |
2023-10-12 | Jigsaw: Supporting Designers in Prototyping Multimodal Applications by Assembling AI Foundation Models | David Chuan-En Lin et.al. | 2310.08574v1 | null |
2023-10-11 | InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining | Boxin Wang et.al. | 2310.07713v1 | link |
2023-10-11 | ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models | Yingqing He et.al. | 2310.07702v1 | link |
2023-10-11 | Knowledge-enhanced Memory Model for Emotional Support Conversation | Mengzhao Jia et.al. | 2310.07700v1 | null |
2023-10-11 | From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions | Zhengfeng Lai et.al. | 2310.07699v1 | link |
2023-10-11 | SurroCBM: Concept Bottleneck Surrogate Models for Generative Post-hoc Explanation | Bo Pan et.al. | 2310.07698v1 | null |
2023-10-11 | ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation | Bo Peng et.al. | 2310.07697v1 | link |
2023-10-11 | Large-scale photonic computing with nonlinear disordered media | Hao Wang et.al. | 2310.07690v1 | null |
2023-10-10 | AutoAD II: The Sequel – Who, When, and What in Movie Audio Description | Tengda Han et.al. | 2310.06838v1 | null |
2023-10-10 | Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency | Eric Zelikman et.al. | 2310.06837v1 | null |
2023-10-10 | What Does Stable Diffusion Know about the 3D Scene? | Guanqi Zhan et.al. | 2310.06836v1 | link |
2023-10-10 | Teaching Language Models to Hallucinate Less with Synthetic Tasks | Erik Jones et.al. | 2310.06827v1 | null |
2023-10-10 | Mistral 7B | Albert Q. Jiang et.al. | 2310.06825v1 | link |
2023-10-10 | The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets | Samuel Marks et.al. | 2310.06824v1 | link |
2023-10-09 | Grokking as Compression: A Nonlinear Complexity Perspective | Ziming Liu et.al. | 2310.05918v1 | null |
2023-10-09 | Drivable Avatar Clothing: Faithful Full-Body Telepresence with Dynamic Clothing Driven by Sparse RGB-D Input | Donglai Xiang et.al. | 2310.05917v1 | null |
2023-10-09 | FireAct: Toward Language Agent Fine-tuning | Baian Chen et.al. | 2310.05915v1 | null |
2023-10-09 | SALMON: Self-Alignment with Principle-Following Reward Models | Zhiqing Sun et.al. | 2310.05910v1 | link |
2023-10-09 | Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts | Lizhang Chen et.al. | 2310.05898v1 | null |
2023-10-06 | BrainSCUBA: Fine-Grained Natural Language Captions of Visual Cortex Selectivity | Andrew F. Luo et.al. | 2310.04420v1 | null |
2023-10-06 | Functional Interpolation for Relative Positions Improves Long Context Transformers | Shanda Li et.al. | 2310.04418v1 | null |
2023-10-09 | CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis | Xiaoxiao Sun et.al. | 2310.04414v2 | null |
2023-10-06 | FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning | Peiran Xu et.al. | 2310.04412v1 | link |
2023-10-06 | RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation | Fangyuan Xu et.al. | 2310.04408v1 | link |
2023-10-06 | Policy-Gradient Training of Language Models for Ranking | Ge Gao et.al. | 2310.04407v1 | null |
2023-10-06 | Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Andy Zhou et.al. | 2310.04406v1 | link |
2023-10-05 | ContactGen: Generative Contact Modeling for Grasp Generation | Shaowei Liu et.al. | 2310.03740v1 | null |
2023-10-05 | Aligning Text-to-Image Diffusion Models with Reward Backpropagation | Mihir Prabhudesai et.al. | 2310.03739v1 | link |
2023-10-05 | Stylist: Style-Driven Feature Ranking for Robust Novelty Detection | Stefan Smeu et.al. | 2310.03738v1 | link |
2023-10-05 | Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency | Tianhong Li et.al. | 2310.03734v1 | null |
2023-10-05 | MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Ke Wang et.al. | 2310.03731v1 | link |
2023-10-05 | Stochastic interpolants with data-dependent couplings | Michael S. Albergo et.al. | 2310.03725v1 | null |
2023-10-04 | LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving | Hao Sha et.al. | 2310.03026v1 | null |
2023-10-04 | Retrieval meets Long Context Large Language Models | Peng Xu et.al. | 2310.03025v1 | null |
2023-10-04 | Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making | Jeonghye Kim et.al. | 2310.03022v1 | null |
2023-10-04 | Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models | Jianglong Ye et.al. | 2310.03020v1 | null |
2023-10-04 | Multimodal Question Answering for Unified Information Extraction | Yuxuan Sun et.al. | 2310.03017v1 | link |
2023-10-04 | Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day | Yifan Jiang et.al. | 2310.03015v1 | null |
2023-10-04 | SemiReward: A General Reward Model for Semi-supervised Learning | Siyuan Li et.al. | 2310.03013v1 | link |
2023-10-04 | Towards Domain-Specific Features Disentanglement for Domain Generalization | Hao Chen et.al. | 2310.03007v1 | null |
2023-10-05 | COOLer: Class-Incremental Learning for Appearance-Based Multiple Object Tracking | Zhizheng Liu et.al. | 2310.03006v2 | link |
2023-10-03 | Generalizable Long-Horizon Manipulations with Large Language Models | Haoyu Zhou et.al. | 2310.02264v1 | null |
2023-10-03 | MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Pan Lu et.al. | 2310.02255v1 | null |
2023-10-03 | Talk2BEV: Language-enhanced Bird’s-eye View Maps for Autonomous Driving | Vikrant Dewangan et.al. | 2310.02251v1 | null |
2023-10-03 | Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models | Huaijin Pi et.al. | 2310.02242v1 | null |
2023-10-03 | MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens | Kaizhi Zheng et.al. | 2310.02239v1 | link |
2023-09-29 | Efficient Streaming Language Models with Attention Sinks | Guangxuan Xiao et.al. | 2309.17453v1 | link |
2023-10-02 | L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models | Ansong Ni et.al. | 2309.17446v2 | null |
2023-10-02 | LLM-grounded Video Diffusion Models | Long Lian et.al. | 2309.17444v2 | null |
2023-09-29 | CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets | Lifan Yuan et.al. | 2309.17428v1 | link |
2023-09-28 | Learning to Transform for Generalizable Instance-wise Invariance | Utkarsh Singhal et.al. | 2309.16672v1 | link |
2023-09-29 | Demystifying CLIP Data | Hu Xu et.al. | 2309.16671v2 | link |
2023-09-28 | RealFill: Reference-Driven Generation for Authentic Image Completion | Luming Tang et.al. | 2309.16668v1 | null |
2023-09-28 | DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation | Jiaxiang Tang et.al. | 2309.16653v1 | link |
2023-09-27 | Exploiting the Signal-Leak Bias in Diffusion Models | Martin Nicolas Everaert et.al. | 2309.15842v1 | null |
2023-09-27 | OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs | Honglin He et.al. | 2309.15830v1 | null |
2023-09-27 | LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement | Haonan Chang et.al. | 2309.15821v1 | null |
2023-09-27 | Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | David Junhao Zhang et.al. | 2309.15818v1 | link |
2023-09-26 | Generating Visual Scenes from Touch | Fengyu Yang et.al. | 2309.15117v1 | null |
2023-09-27 | InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition | Pan Zhang et.al. | 2309.15112v2 | link |
2023-09-26 | Doduo: Learning Dense Visual Correspondence from Unsupervised Semantic-Aware Flow | Zhenyu Jiang et.al. | 2309.15110v1 | null |
2023-09-26 | DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation | Zeyu Wang et.al. | 2309.15109v1 | link |
2023-09-26 | New solution to Airy’s equation for modeling beams near turning points | N. A. Lopez et.al. | 2309.15108v1 | null |
2023-09-25 | Extreme Parkour with Legged Robots | Xuxin Cheng et.al. | 2309.14341v1 | null |
2023-09-25 | Chop & Learn: Recognizing and Generating Object-State Compositions | Nirat Saini et.al. | 2309.14339v1 | null |
2023-09-25 | UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation | Jianglin Fu et.al. | 2309.14335v1 | link |
2023-09-25 | Tasks Makyth Models: Machine Learning Assisted Surrogates for Tipping Points | Gianluca Fabiani et.al. | 2309.14334v1 | null |
2023-09-25 | Innovative Digital Storytelling with AIGC: Exploration and Discussion of Recent Advances | Rongzhang Gu et.al. | 2309.14329v1 | null |
2023-09-25 | pyParaOcean: A System for Visual Analysis of Ocean Data | Toshit Jain et.al. | 2309.14328v1 | null |
2023-09-22 | E(2)-Equivariant Graph Planning for Navigation | Linfeng Zhao et.al. | 2309.13043v1 | null |
2023-09-22 | MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation | Jiahao Xie et.al. | 2309.13042v1 | link |
2023-09-22 | Robotic Offline RL from Internet Videos via Value-Function Pre-Training | Chethan Bhateja et.al. | 2309.13041v1 | null |
2023-09-22 | Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? | Xiaoxiao Sun et.al. | 2309.13038v1 | null |
2023-09-22 | GELLO: A General, Low-Cost, and Intuitive Teleoperation Framework for Robot Manipulators | Philipp Wu et.al. | 2309.13037v1 | null |
2023-09-22 | A numerical framework for simulating progressive failure in composite laminates under high-cycle fatigue loading | Pieter Hofman et.al. | 2309.13030v1 | null |
2023-09-21 | LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Jianing Yang et.al. | 2309.12311v1 | null |
2023-09-21 | Rehearsal: Simulating Conflict to Teach Conflict Resolution | Omar Shaikh et.al. | 2309.12309v1 | null |
2023-09-21 | Text-Guided Vector Graphics Customization | Peiying Zhang et.al. | 2309.12302v1 | null |
2023-09-21 | Environment-biased Feature Ranking for Novelty Detection Robustness | Stefan Smeu et.al. | 2309.12301v1 | null |
2023-09-21 | Reranking for Natural Language Generation from Logical Forms: A Study based on Large Language Models | Levon Haroutunian et.al. | 2309.12294v1 | null |
2023-09-20 | A Large-scale Dataset for Audio-Language Representation Learning | Luoyi Sun et.al. | 2309.11500v1 | null |
2023-09-20 | DreamLLM: Synergistic Multimodal Comprehension and Creation | Runpei Dong et.al. | 2309.11499v1 | link |
2023-09-20 | FreeU: Free Lunch in Diffusion U-Net | Chenyang Si et.al. | 2309.11497v1 | link |
2023-09-20 | Chain-of-Verification Reduces Hallucination in Large Language Models | Shehzaad Dhuliawala et.al. | 2309.11495v1 | null |
2023-09-21 | Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning | Tianbao Xie et.al. | 2309.11489v2 | link |
2023-09-19 | PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes | Xiao Fu et.al. | 2309.10815v1 | link |
2023-09-19 | Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning | Tianhua Zhang et.al. | 2309.10814v1 | link |
2023-09-19 | PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance | Peiqing Yang et.al. | 2309.10810v1 | link |
2023-09-20 | AI Foundation Models for Weather and Climate: Applications, Design, and Implementation | S. Karthik Mukkavilli et.al. | 2309.10808v2 | null |
2023-09-19 | Heuristic Search for Path Finding with Refuelling | Anushtup Nandy et.al. | 2309.10796v1 | null |
2023-09-19 | Guide Your Agent with Adaptive Multimodal Rewards | Changyeon Kim et.al. | 2309.10790v1 | link |
2023-09-18 | General In-Hand Object Rotation with Vision and Touch | Haozhi Qi et.al. | 2309.09979v1 | null |
2023-09-18 | GEDepth: Ground Embedding for Monocular Depth Estimation | Xiaodong Yang et.al. | 2309.09975v1 | link |
2023-09-19 | MindAgent: Emergent Gaming Interaction | Ran Gong et.al. | 2309.09971v2 | null |
2023-09-18 | Empirical Study of Mix-based Data Augmentation Methods in Physiological Time Series Data | Peikun Guo et.al. | 2309.09970v1 | link |
2023-09-18 | Prompt a Robot to Walk with Large Language Models | Yen-Jen Wang et.al. | 2309.09969v1 | link |
2023-09-18 | Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees | Alexia Jolicoeur-Martineau et.al. | 2309.09968v1 | link |
2023-09-15 | Robust e-NeRF: NeRF from Sparse & Noisy Events under Non-Uniform Motion | Weng Fei Low et.al. | 2309.08596v1 | link |
2023-09-15 | Chain-of-Thought Reasoning is a Policy Improvement Operator | Hugh Zhang et.al. | 2309.08589v1 | null |
2023-09-15 | Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes | Fabien Delattre et.al. | 2309.08588v1 | null |
2023-09-15 | Compositional Foundation Models for Hierarchical Planning | Anurag Ajay et.al. | 2309.08587v1 | null |
2023-09-15 | Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding | Xiaonan Lu et.al. | 2309.08585v1 | null |
2023-09-15 | ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer | Arkadiy Saakyan et.al. | 2309.08583v1 | link |
2023-09-15 | Large-Vocabulary 3D Diffusion Model with Transformer | Ziang Cao et.al. | 2309.07920v2 | null |
2023-09-14 | Unified Human-Scene Interaction via Prompted Chain-of-Contacts | Zeqi Xiao et.al. | 2309.07918v1 | link |
2023-09-14 | Looking at words and points with attention: a benchmark for text-to-shape coherence | Andrea Amaduzzi et.al. | 2309.07917v1 | null |
2023-09-14 | MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning | Haozhe Zhao et.al. | 2309.07915v1 | link |
2023-09-14 | ALWOD: Active Learning for Weakly-Supervised Object Detection | Yuting Wang et.al. | 2309.07914v1 | link |
2023-09-14 | Why would you put a flashlight in a dark matter detector? | R. Gibbons et.al. | 2309.07913v1 | null |
2023-09-14 | TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting | Rohan Choudhury et.al. | 2309.07910v1 | null |
2023-09-14 | Physically Plausible Full-Body Hand-Object Interaction Synthesis | Jona Braun et.al. | 2309.07907v1 | null |
2023-09-14 | Generative Image Dynamics | Zhengqi Li et.al. | 2309.07906v1 | null |
2023-09-13 | Text-Guided Generation and Editing of Compositional 3D Avatars | Hao Zhang et.al. | 2309.07125v1 | null |
2023-09-13 | RAIN: Your Language Models Can Align Themselves without Finetuning | Yuhui Li et.al. | 2309.07124v1 | link |
2023-09-13 | Tree-Structured Shading Decomposition | Chen Geng et.al. | 2309.07122v1 | null |
2023-09-13 | Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics | Haoqin Tu et.al. | 2309.07120v1 | link |
2023-09-13 | Weakly-Supervised Multi-Task Learning for Audio-Visual Speaker Verification | Anith Selvakumar et.al. | 2309.07115v1 | null |
2023-09-13 | Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology | Nirhoshan Sivaroopan et.al. | 2309.07113v1 | null |
2023-09-13 | Hardening RGB-D Object Recognition Systems against Adversarial Patch Attacks | Yang Zheng et.al. | 2309.07106v1 | null |
2023-09-12 | Learning Disentangled Avatars with Hybrid 3D Representations | Yao Feng et.al. | 2309.06441v1 | null |
2023-09-12 | Unveiling the potential of large language models in generating semantic and cross-language clones | Palash R. Roy et.al. | 2309.06424v1 | null |
2023-09-12 | C4CAM: A Compiler for CAM-based In-memory Accelerators | Hamid Farzaneh et.al. | 2309.06418v1 | null |
2023-09-12 | Robot Parkour Learning | Ziwen Zhuang et.al. | 2309.05665v2 | null |
2023-09-11 | Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips | Yufei Ye et.al. | 2309.05663v1 | null |
2023-09-11 | ViHOPE: Visuotactile In-Hand Object 6D Pose Estimation with Shape Completion | Hongyu Li et.al. | 2309.05662v1 | null |
2023-09-11 | Hypothesis Search: Inductive Reasoning with Language Models | Ruocheng Wang et.al. | 2309.05660v1 | null |
2023-09-11 | From Capture to Display: A Survey on Volumetric Video | Yili Jin et.al. | 2309.05658v1 | null |
2023-09-11 | MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Xiang Yue et.al. | 2309.05653v1 | null |
2023-09-11 | Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck | K. Michael Martini et.al. | 2309.05649v1 | null |
2023-09-08 | On the Actionability of Outcome Prediction | Lydia T. Liu et.al. | 2309.04470v1 | null |
2023-09-08 | Generalized Cross-domain Multi-label Few-shot Learning for Chest X-rays | Aroof Aimen et.al. | 2309.04462v1 | null |
2023-09-08 | Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models | Yangyi Chen et.al. | 2309.04461v1 | link |
2023-09-08 | Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning | David Yunis et.al. | 2309.04459v1 | null |
2023-09-08 | Effect of Electron-Phonon Interactions on Three-Level QD-based Spaser: Linear and Quadratic Potentials | Ankit Purohit et.al. | 2309.04448v1 | null |
2023-09-07 | ImageBind-LLM: Multi-modality Instruction Tuning | Jiaming Han et.al. | 2309.03905v1 | link |
2023-09-07 | Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Jiapeng Zhu et.al. | 2309.03904v1 | link |
2023-09-07 | Tracking Anything with Decoupled Video Segmentation | Ho Kei Cheng et.al. | 2309.03903v1 | link |
2023-09-07 | The Making and Breaking of Camouflage | Hala Lamdouar et.al. | 2309.03899v1 | null |
2023-09-07 | InstructDiffusion: A Generalist Modeling Interface for Vision Tasks | Zigang Geng et.al. | 2309.03895v1 | null |
2023-09-07 | DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection | Manlin Zhang et.al. | 2309.03893v1 | null |
2023-09-07 | ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation | Hui Zhang et.al. | 2309.03891v1 | null |
2023-09-06 | My Art My Choice: Adversarial Protection Against Unruly AI | Anthony Rhodes et.al. | 2309.03198v1 | null |
2023-09-06 | Electrocaloric Response of the Dense Ferroelectric Nanocomposites | Anna N. Morozovska et.al. | 2309.03187v1 | null |
2023-09-06 | SLiMe: Segment Like Me | Aliasghar Khani et.al. | 2309.03179v1 | link |
2023-09-05 | ReliTalk: Relightable Talking Portrait Generation from a Single Video | Haonan Qiu et.al. | 2309.02434v1 | link |
2023-09-05 | Generating Realistic Images from In-the-wild Sounds | Taegyeong Lee et.al. | 2309.02405v1 | null |
2023-09-01 | OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Zhening Huang et.al. | 2309.00616v1 | link |
2023-09-01 | Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following | Ziyu Guo et.al. | 2309.00615v1 | link |
2023-09-01 | Iterative Multi-granular Image Editing using Diffusion Models | K J Joseph et.al. | 2309.00613v1 | null |
2023-09-01 | CityDreamer: Compositional Generative Model of Unbounded 3D Cities | Haozhe Xie et.al. | 2309.00610v1 | link |
2023-09-01 | Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair | Yuxiang Wei et.al. | 2309.00608v1 | link |
2023-08-31 | PointLLM: Empowering Large Language Models to Understand Point Clouds | Runsen Xu et.al. | 2308.16911v1 | link |
2023-08-31 | StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation | Yuhan Wang et.al. | 2308.16909v1 | link |
2023-08-31 | Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator | Xiaolong Wang et.al. | 2308.16906v1 | link |
2023-08-31 | InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion | Sirui Xu et.al. | 2308.16905v1 | link |
2023-08-31 | Transformers as Support Vector Machines | Davoud Ataee Tarzanagh et.al. | 2308.16898v1 | link |
2023-09-01 | GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields | Yanjie Ze et.al. | 2308.16891v2 | link |
2023-08-31 | Prediction of Diblock Copolymer Morphology via Machine Learning | Hyun Park et.al. | 2308.16886v1 | null |
2023-08-30 | Learning Vision-based Pursuit-Evasion Robot Policies | Andrea Bajcsy et.al. | 2308.16185v1 | null |
2023-08-30 | SAM-Med2D | Junlong Cheng et.al. | 2308.16184v1 | link |
2023-08-30 | GREC: Generalized Referring Expression Comprehension | Shuting He et.al. | 2308.16182v1 | link |
2023-08-30 | Framework and Methodology for Verification of a Complex Scientific Simulation Software, Flash-X | Akash Dhruv et.al. | 2308.16180v1 | null |
2023-08-30 | General Purpose Audio Effect Removal | Matthew Rice et.al. | 2308.16177v1 | link |
2023-08-30 | Quantifying Uncertainty in Answers from any Language Model via Intrinsic and Extrinsic Confidence Assessment | Jiuhai Chen et.al. | 2308.16175v1 | null |
2023-08-29 | 3D Adversarial Augmentations for Robust Out-of-Domain Predictions | Alexander Lehner et.al. | 2308.15479v1 | null |
2023-08-29 | A General-Purpose Self-Supervised Model for Computational Pathology | Richard J. Chen et.al. | 2308.15474v1 | null |
2023-08-29 | Learning Modulated Transformation in GANs | Ceyuan Yang et.al. | 2308.15472v1 | link |
2023-08-29 | Input margins can predict generalization too | Coenraad Mouton et.al. | 2308.15466v1 | null |
2023-08-30 | Sharing proofs with predicative theories through universe polymorphic elaboration | Thiago Felicissimo et.al. | 2308.15465v2 | link |
2023-08-29 | ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer | Zachary Horvitz et.al. | 2308.15459v1 | link |
2023-08-29 | From SMOTE to Mixup for Deep Imbalanced Classification | Wei-Chao Cheng et.al. | 2308.15457v1 | link |
2023-08-28 | AI Deception: A Survey of Examples, Risks, and Potential Solutions | Peter S. Park et.al. | 2308.14752v1 | null |
2023-08-28 | MagicAvatar: Multimodal Avatar Generation and Animation | Jianfeng Zhang et.al. | 2308.14748v1 | null |
2023-08-28 | CoVR: Learning Composed Video Retrieval from Web Video Captions | Lucas Ventura et.al. | 2308.14746v1 | link |
2023-08-28 | Advancement on Security Applications of Private Intersection Sum Protocol | Yuvaray Athur Raghuvir et.al. | 2308.14741v1 | null |
2023-08-28 | Total Selfie: Generating Full-Body Selfies | Bowei Chen et.al. | 2308.14740v1 | null |
2023-08-28 | Bayesian artificial brain with ChatGPT | Renato A. Krohling et.al. | 2308.14732v1 | null |
2023-08-28 | Distilled GPT for Source Code Summarization | Chia-Yi Su et.al. | 2308.14731v1 | link |
2023-08-25 | ChatGPT as Data Augmentation for Compositional Generalization: A Case Study in Open Intent Detection | Yihao Fang et.al. | 2308.13517v1 | link |
2023-08-25 | Does Asking Clarifying Questions Increases Confidence in Generated Code? On the Communication Skills of Large Language Models | Jie JW Wu et.al. | 2308.13507v1 | null |
2023-08-25 | A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance | Ian Colbert et.al. | 2308.13504v1 | null |
2023-08-25 | Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning | Pranav Balaji et.al. | 2308.13503v1 | null |
2023-08-24 | ROAM: Robust and Object-aware Motion Generation using Neural Pose Descriptors | Wanyue Zhang et.al. | 2308.12969v1 | null |
2023-08-24 | Dense Text-to-Image Generation with Attention Modulation | Yunji Kim et.al. | 2308.12964v1 | link |
2023-08-24 | MapPrior: Bird’s-Eye View Map Layout Estimation with Generative Models | Xiyue Zhu et.al. | 2308.12963v1 | null |
2023-08-24 | Motion-Guided Masking for Spatiotemporal Representation Learning | David Fan et.al. | 2308.12962v1 | null |
2023-08-24 | Less is More: Towards Efficient Few-shot 3D Semantic Segmentation via Training-free Networks | Xiangyang Zhu et.al. | 2308.12961v1 | link |
2023-08-24 | Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment | Sheng Zhang et.al. | 2308.12960v1 | link |
2023-08-24 | Semi-analytical Framework for Modeling Strong Coupling of Quantum Emitters in Electromagnetic Resonators | Mohammad Abutoama et.al. | 2308.12957v1 | null |
2023-08-24 | A new framework for global data regulation | Ellie Graeden et.al. | 2308.12955v1 | null |
2023-08-24 | BridgeData V2: A Dataset for Robot Learning at Scale | Homer Walke et.al. | 2308.12952v1 | link |
2023-08-24 | Label Budget Allocation in Multi-Task Learning | Ximeng Sun et.al. | 2308.12949v1 | null |
2023-08-23 | CHORUS: Learning Canonicalized 3D Human-Object Spatial Relations from Unbounded Synthesized Images | Sookwan Han et.al. | 2308.12288v1 | null |
2023-08-23 | Devising and Detecting Phishing: large language models vs. Smaller Human Models | Fredrik Heiding et.al. | 2308.12287v1 | null |
2023-08-23 | On-Manifold Projected Gradient Descent | Aaron Mahler et.al. | 2308.12279v1 | null |
2023-08-24 | A Model for Integrating Generative AI into Course Content Development | Ethan Dickey et.al. | 2308.12276v2 | null |
2023-08-23 | Spatial clustering of temporal energy profiles with empirical orthogonal functions and max-p regionalization | Claire Halloran et.al. | 2308.12274v1 | null |
2023-08-23 | Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models | Nancy Tyagi et.al. | 2308.12272v1 | null |
2023-08-23 | A Generative Approach for Image Registration of Visible-Thermal (VT) Cancer Faces | Catherine Ordun et.al. | 2308.12271v1 | null |
2023-08-23 | Language Reward Modulation for Pretraining Reinforcement Learning | Ademi Adeniji et.al. | 2308.12270v1 | link |
2023-08-22 | GRIP: Generating Interaction Poses Using Latent Consistency and Spatial Cues | Omid Taheri et.al. | 2308.11617v1 | null |
2023-08-22 | StoryBench: A Multifaceted Benchmark for Continuous Story Visualization | Emanuele Bugliarello et.al. | 2308.11606v1 | link |
2023-08-22 | GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning | Mainak Singha et.al. | 2308.11605v1 | null |
2023-08-22 | Towards Universal Interaction for Extended Reality | Pascal Knierim et.al. | 2308.11600v1 | null |
2023-08-22 | Theory of Transverse Mode Instability in Fiber Amplifiers with Multimode Excitations | Kabish Wisal et.al. | 2308.11599v1 | null |
2023-08-22 | Vision-Based Intelligent Robot Grasping Using Sparse Neural Network | Priya Shukla et.al. | 2308.11590v1 | null |
2023-08-21 | Structured World Models from Human Videos | Russell Mendonca et.al. | 2308.10901v1 | null |
2023-08-21 | TADA! Text to Animatable Digital Avatars | Tingting Liao et.al. | 2308.10899v1 | null |
2023-08-21 | Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation | Xueyi Liu et.al. | 2308.10898v1 | link |
2023-08-21 | Can Language Models Learn to Listen? | Evonne Ng et.al. | 2308.10897v1 | null |
2023-08-21 | Differentiable Shadow Mapping for Efficient Inverse Graphics | Markus Worchel et.al. | 2308.10896v1 | link |
2023-08-21 | Proton-Boron Fusion Yield Increased by Orders of Magnitude with Foam Targets | Wen-Qing Wei et.al. | 2308.10878v1 | null |
2023-08-21 | Analyzing Transformer Dynamics as Movement through Embedding Space | Sumeet S. Singh et.al. | 2308.10874v1 | null |
2023-08-18 | HumanLiff: Layer-wise 3D Human Generation with Diffusion Model | Shoukang Hu et.al. | 2308.09712v1 | null |
2023-08-18 | Robust Monocular Depth Estimation under Challenging Conditions | Stefano Gasperini et.al. | 2308.09711v1 | null |
2023-08-18 | SimDA: Simple Diffusion Adapter for Efficient Video Generation | Zhen Xing et.al. | 2308.09710v1 | null |
2023-08-18 | Training with Product Digital Twins for AutoRetail Checkout | Yue Yao et.al. | 2308.09708v1 | link |
2023-08-18 | Guide3D: Create 3D Avatars from Text and Image Guidance | Yukang Cao et.al. | 2308.09705v1 | null |
2023-08-18 | Counting and Sampling Labeled Chordal Graphs in Polynomial Time | Ursula Hebert-Johnson et.al. | 2308.09703v1 | null |
2023-08-16 | TeCH: Text-guided Reconstruction of Lifelike Clothed Humans | Yangyi Huang et.al. | 2308.08545v1 | link |
2023-08-16 | InsightMapper: A Closer Look at Inner-instance Information for Vectorized High-Definition Mapping | Zhenhua Xu et.al. | 2308.08543v1 | null |
2023-08-15 | Enumerating Tarski fixed points on lattices of binary relations | Julian Müller et.al. | 2308.07923v1 | null |
2023-08-15 | Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | Aojun Zhou et.al. | 2308.07921v1 | null |
2023-08-15 | The Regular Expression Inference Challenge | Mojtaba Valizadeh et.al. | 2308.07899v1 | null |
2023-08-15 | A Foundation LAnguage-Image model of the Retina (FLAIR): Encoding expert knowledge in text supervision | Julio Silva-Rodriguez et.al. | 2308.07898v1 | link |
2023-08-14 | Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation | Alexander Martin et.al. | 2308.07316v1 | link |
2023-08-14 | Reinforcing Security and Usability of Crypto-Wallet with Post-Quantum Cryptography and Zero-Knowledge Proof | Yathin Kethepalli et.al. | 2308.07309v1 | null |
2023-08-15 | LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked | Alec Helbling et.al. | 2308.07308v2 | null |
2023-08-14 | Extend Wave Function Collapse to Large-Scale Content Generation | Yuhe Nie et.al. | 2308.07307v1 | null |
2023-08-14 | Neural Authorship Attribution: Stylometric Analysis on Large Language Models | Tharindu Kumarage et.al. | 2308.07305v1 | link |
2023-08-14 | DiffSED: Sound Event Detection with Denoising Diffusion | Swapnil Bhosale et.al. | 2308.07293v1 | null |
2023-08-11 | Foundation Model is Efficient Multimodal Multitask Model Selector | Fanqing Meng et.al. | 2308.06262v1 | link |
2023-08-11 | Enhancing Network Management Using Code Generated by Large Language Models | Sathiya Kumaran Mani et.al. | 2308.06261v1 | link |
2023-08-11 | Self-Alignment with Instruction Backtranslation | Xian Li et.al. | 2308.06259v1 | null |
2023-08-11 | NEMA NU 2-2018 performance evaluation of a new generation digital 32-cm axial field-of-view Omni Legend PET-CT | Rhodri Lyn Smith et.al. | 2308.06255v1 | null |
2023-08-11 | Fundamental Limits on Subwavelength Range Resolution | Andrew N. Jordan et.al. | 2308.06252v1 | null |
2023-08-11 | ARGUS: Visualization of AI-Assisted Task Guidance in AR | Sonia Castelo et.al. | 2308.06246v1 | null |
2023-08-10 | PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs | Wentao Hu et.al. | 2308.05744v1 | link |
2023-08-10 | Neural Progressive Meshes | Yun-Chun Chen et.al. | 2308.05741v1 | null |
2023-08-10 | AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Haohe Liu et.al. | 2308.05734v1 | link |
2023-08-10 | FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models | Guangkai Xu et.al. | 2308.05733v1 | null |
2023-08-09 | Scene-Generalizable Interactive Segmentation of Radiance Fields | Songlin Tang et.al. | 2308.05104v1 | null |
2023-08-09 | LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation | Leigang Qu et.al. | 2308.05095v1 | null |
2023-08-08 | SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore | Sewon Min et.al. | 2308.04430v1 | link |
2023-08-08 | A Deep-Learning Method Using Auto-encoder and Generative Adversarial Network for Anomaly Detection on Ancient Stone Stele Surfaces | Yikun Liu et.al. | 2308.04426v1 | null |
2023-08-08 | Density-contrast induced inertial forces on particles in oscillatory flows | Siddhansh Agarwal et.al. | 2308.04423v1 | null |
2023-08-08 | Near-field 6G Networks: Why Mobile Terahertz Communications MUST Operate in the Near Field | Vitaly Petrov et.al. | 2308.04418v1 | null |
2023-08-08 | DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images | Xuechao Zou et.al. | 2308.04417v1 | link |
2023-08-07 | FSD V2: Improving Fully Sparse 3D Object Detection with Virtual Voxels | Lue Fan et.al. | 2308.03755v1 | link |
2023-08-07 | Mask Frozen-DETR: High Quality Instance Segmentation with One GPU | Zhanhao Liang et.al. | 2308.03747v1 | null |
2023-08-07 | A Cost Analysis of Generative Language Models and Influence Operations | Micah Musser et.al. | 2308.03740v1 | link |
2023-08-07 | Labeling without Seeing? Blind Annotation for Privacy-Preserving Entity Resolution | Yixiang Yao et.al. | 2308.03734v1 | null |
2023-08-07 | SurvBeX: An explanation method of the machine learning survival models based on the Beran estimator | Lev V. Utkin et.al. | 2308.03730v1 | link |
2023-08-04 | Recovering non-Maxwellian particle velocity distribution functions from collective Thomson-scattered spectra | Bryan C. Foo et.al. | 2308.02488v1 | null |
2023-08-04 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | Qihang Yu et.al. | 2308.02487v1 | link |
2023-08-04 | On the Inherent Anonymity of Gossiping | Rachid Guerraoui et.al. | 2308.02477v1 | null |
2023-08-04 | Towards Generalist Foundation Model for Radiology | Chaoyi Wu et.al. | 2308.02463v1 | link |
2023-08-04 | Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints | Yasunori Toshimitsu et.al. | 2308.02453v1 | link |
2023-08-03 | The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World | Weiyun Wang et.al. | 2308.01907v1 | link |
2023-08-03 | Revisiting Deformable Convolution for Depth Completion | Xinglong Sun et.al. | 2308.01905v1 | link |
2023-08-03 | UniSim: A Neural Closed-Loop Sensor Simulator | Ze Yang et.al. | 2308.01898v1 | null |
2023-08-03 | Strategies for optimizing plasmonic grating couplers with topology-based inverse design | Michael Efseaff et.al. | 2308.01893v1 | null |
2023-08-02 | ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders | Shawn Xu et.al. | 2308.01317v1 | null |
2023-08-02 | Patched Denoising Diffusion Models For High-Resolution Image Synthesis | Zheng Ding et.al. | 2308.01316v1 | link |
2023-08-02 | More Context, Less Distraction: Visual Classification by Inferring and Conditioning on Contextual Attributes | Bang An et.al. | 2308.01313v1 | link |
2023-08-02 | TEASMA: A Practical Approach for the Test Assessment of Deep Neural Networks using Mutation Analysis | Amin Abbasishahkoo et.al. | 2308.01311v1 | null |
2023-08-02 | Revisiting DETR Pre-training for Object Detection | Yan Ma et.al. | 2308.01300v1 | null |
2023-08-01 | LISA: Reasoning Segmentation via Large Language Model | Xin Lai et.al. | 2308.00692v1 | link |
2023-08-01 | AnyLoc: Towards Universal Visual Place Recognition | Nikhil Keetha et.al. | 2308.00688v1 | link |
2023-08-01 | Learning from Hypervectors: A Survey on Hypervector Encoding | Sercan Aygun et.al. | 2308.00685v1 | null |
2023-07-31 | Conformal PID Control for Time Series Prediction | Anastasios N. Angelopoulos et.al. | 2307.16895v1 | link |
2023-07-31 | A reduced order model for geometrically parameterized two-scale simulations of elasto-plastic microstructures under large deformations | Theron Guo et.al. | 2307.16894v1 | null |
2023-07-31 | LEONARDO: A Pan-European Pre-Exascale Supercomputer for HPC and AI Applications | Matteo Turisini et.al. | 2307.16885v1 | null |
2023-07-31 | HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution | Ehsan Kamalloo et.al. | 2307.16883v1 | link |
2023-07-31 | Image Synthesis under Limited Data: A Survey and Taxonomy | Mengping Yang et.al. | 2307.16879v1 | link |
2023-07-31 | Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy | Shibo Jie et.al. | 2307.16867v1 | link |
2023-07-28 | Uncertainty in Natural Language Generation: From Theory to Applications | Joris Baan et.al. | 2307.15703v1 | null |
2023-07-28 | The Strong Maximum Circulation Algorithm: A New Method for Aggregating Preference Rankings | Nathan Atkinson et.al. | 2307.15702v1 | null |
2023-07-31 | MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Ruopeng Gao et.al. | 2307.15700v2 | link |
2023-07-28 | PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding | Davide Boscaini et.al. | 2307.15692v1 | link |
2023-07-28 | Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Nico Gürtler et.al. | 2307.15690v1 | link |
2023-07-27 | PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking | Yang Zheng et.al. | 2307.15055v1 | link |
2023-07-27 | A Geometric Notion of Causal Probing | Clément Guerner et.al. | 2307.15054v1 | null |
2023-07-27 | A Transformer-based Approach for Arabic Offline Handwritten Text Recognition | Saleh Momeni et.al. | 2307.15045v1 | null |
2023-07-27 | Universal and Transferable Adversarial Attacks on Aligned Language Models | Andy Zou et.al. | 2307.15043v1 | link |
2023-07-27 | 3-Coloring $C_4$ or $C_3$-free Diameter Two Graphs | Tereza Klimošová et.al. | 2307.15036v1 | null |
2023-07-26 | WavJourney: Compositional Audio Creation with Large Language Models | Xubo Liu et.al. | 2307.14335v1 | link |
2023-07-26 | Towards Generalist Biomedical AI | Tao Tu et.al. | 2307.14334v1 | null |
2023-07-26 | Waypoint-Based Imitation Learning for Robotic Manipulation | Lucy Xiaoyang Shi et.al. | 2307.14326v1 | null |
2023-07-25 | Benchmarking and Analyzing Generative Data for Visual Recognition | Bo Li et.al. | 2307.13697v1 | null |
2023-07-25 | A Compact DAG for Storing and Searching Maximal Common Subsequences | Alessio Conte et.al. | 2307.13695v1 | null |
2023-07-25 | A Comprehensive Review of Recent Research Trends on UAVs | Kaled Telli et.al. | 2307.13691v1 | null |
2023-07-25 | Single reference treatment of strongly correlated H$_4$ and H$_{10}$ isomers with Richardson-Gaudin states | Paul A. Johnson et.al. | 2307.13690v1 | null |
2023-07-25 | All-optical GeV electron bunch generation in a laser-plasma accelerator via truncated-channel injection | A. Picksley et.al. | 2307.13689v1 | null |
2023-07-25 | The Visual Language of Fabrics | Valentin Deschaintre et.al. | 2307.13681v1 | null |
2023-07-25 | High Probability Analysis for Non-Convex Stochastic Optimization with Clipping | Shaojie Li et.al. | 2307.13680v1 | null |
2023-07-24 | A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Jindong Gu et.al. | 2307.12980v1 | link |
2023-07-24 | Evaluating the Ripple Effects of Knowledge Editing in Language Models | Roi Cohen et.al. | 2307.12976v1 | link |
2023-07-24 | Volcanic ash delimitation using Artificial Intelligence based on Pix2Pix | Christian Carrillo et.al. | 2307.12970v1 | null |
2023-07-24 | Aligning Large Language Models with Human: A Survey | Yufei Wang et.al. | 2307.12966v1 | link |
2023-07-24 | RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment | Kevin Yang et.al. | 2307.12950v1 | link |
2023-07-24 | Boosting Punctuation Restoration with Data Generation and Reinforcement Learning | Viet Dac Lai et.al. | 2307.12949v1 | link |
2023-07-21 | Advancing Ad Auction Realism: Practical Insights & Modeling Implications | Ming Chen et.al. | 2307.11732v1 | null |
2023-07-21 | OUTFOX: LLM-generated Essay Detection through In-context Learning with Adversarially Generated Examples | Ryuto Koike et.al. | 2307.11729v1 | link |
2023-07-21 | Benchmark datasets for biomedical knowledge graphs with negative statements | Rita T. Sousa et.al. | 2307.11719v1 | null |
2023-07-20 | L-Eval: Instituting Standardized Evaluation for Long Context Language Models | Chenxin An et.al. | 2307.11088v1 | link |
2023-07-20 | AlignDet: Aligning Pre-training and Fine-tuning in Object Detection | Ming Li et.al. | 2307.11077v1 | link |
2023-07-20 | OBJECT 3DIT: Language-guided 3D-aware Image Editing | Oscar Michel et.al. | 2307.11073v1 | null |
2023-07-19 | Adversarial Latent Autoencoder with Self-Attention for Structural Image Synthesis | Jiajie Fan et.al. | 2307.10166v1 | null |
2023-07-19 | Rethinking Backdoor Attacks | Alaa Khaddaj et.al. | 2307.10163v1 | null |
2023-07-19 | Robust Driving Policy Learning with Guided Meta Reinforcement Learning | Kanghoon Lee et.al. | 2307.10160v1 | null |
2023-07-19 | FABRIC: Personalizing Diffusion Models with Iterative Feedback | Dimitri von Rütte et.al. | 2307.10159v1 | link |
2023-07-19 | Contact-aware Shaping and Maintenance of Deformable Linear Objects With Fixtures | Kejia Chen et.al. | 2307.10153v1 | null |
2023-07-18 | Forecasting the steam mass flow in a powerplant using the parallel hybrid network | Andrii Kurkin et.al. | 2307.09483v1 | null |
2023-07-18 | AnyDoor: Zero-shot Object-level Image Customization | Xi Chen et.al. | 2307.09481v1 | link |
2023-07-18 | ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning | Liang Zhao et.al. | 2307.09474v1 | null |
2023-07-18 | Optimal Vehicle Trajectory Planning for Static Obstacle Avoidance using Nonlinear Optimization | Yajia Zhang et.al. | 2307.09466v1 | null |
2023-07-19 | Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla | Tom Lieberum et.al. | 2307.09458v2 | null |
2023-07-19 | A comparative analysis of SRGAN models | Fatemeh Rezapoor Nikroo et.al. | 2307.09456v2 | null |
2023-07-18 | Solving Knapsack with Small Items via L0-Proximity | Ce Jin et.al. | 2307.09454v1 | null |
2023-07-17 | Diffusion Models Beat GANs on Image Classification | Soumik Mukhopadhyay et.al. | 2307.08702v1 | null |
2023-07-17 | AlpaGasus: Training A Better Alpaca with Fewer Data | Lichang Chen et.al. | 2307.08701v1 | link |
2023-07-17 | Fast model inference and training on-board of Satellites | Vít Růžička et.al. | 2307.08700v1 | link |
2023-07-17 | Pair then Relation: Pair-Net for Panoptic Scene Graph Generation | Jinghao Wang et.al. | 2307.08699v1 | link |
2023-07-17 | Flow Matching in Latent Space | Quan Dao et.al. | 2307.08698v1 | link |
2023-07-17 | FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | Tri Dao et.al. | 2307.08691v1 | link |
2023-07-17 | COLLIE: Systematic Construction of Constrained Text Generation Tasks | Shunyu Yao et.al. | 2307.08689v1 | link |
2023-07-14 | NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis | Nilesh Kulkarni et.al. | 2307.07511v1 | null |
2023-07-14 | A Poisson Decomposition for Information and the Information-Event Diagram | Cheuk Ting Li et.al. | 2307.07506v1 | null |
2023-07-14 | Exhaustive Generation of Linear Orthogonal Cellular Automata | Enrico Formenti et.al. | 2307.07505v1 | null |
2023-07-14 | TALL: Thumbnail Layout for Deepfake Video Detection | Yuting Xu et.al. | 2307.07494v1 | link |
2023-07-14 | BehAVExplor: Behavior Diversity Guided Testing for Autonomous Driving Systems | Mingfei Cheng et.al. | 2307.07493v1 | null |
2023-07-14 | PseudoCal: A Source-Free Approach to Unsupervised Uncertainty Calibration in Domain Adaptation | Dapeng Hu et.al. | 2307.07489v1 | null |
2023-07-13 | HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models | Nataniel Ruiz et.al. | 2307.06949v1 | null |
2023-07-13 | Self-regulating Prompts: Foundational Model Adaptation without Forgetting | Muhammad Uzair Khattak et.al. | 2307.06948v1 | link |
2023-07-13 | In-context Autoencoder for Context Compression in a Large Language Model | Tao Ge et.al. | 2307.06945v1 | link |
2023-07-13 | InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | Yi Wang et.al. | 2307.06942v1 | link |
2023-07-13 | Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation | Yingqing He et.al. | 2307.06940v1 | link |
2023-07-12 | Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation | Andi Peng et.al. | 2307.06333v1 | null |
2023-07-12 | Deep Learning of Crystalline Defects from TEM images: A Solution for the Problem of “Never Enough Training Data” | Kishan Govind et.al. | 2307.06322v1 | null |
2023-07-12 | Facial Reenactment Through a Personalized Generator | Ariel Elazary et.al. | 2307.06307v1 | null |
2023-07-12 | Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes | Sohom Mukherjee et.al. | 2307.06306v1 | link |
2023-07-11 | Scale Alone Does not Improve Mechanistic Interpretability in Vision Models | Roland S. Zimmermann et.al. | 2307.05471v1 | null |
2023-07-12 | My3DGen: Building Lightweight Personalized 3D Generative Model | Luchao Qi et.al. | 2307.05468v2 | null |
2023-07-11 | EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone | Shraman Pramanick et.al. | 2307.05463v1 | link |
2023-07-11 | Efficient 3D Articulated Human Generation with Layered Surface Volumes | Yinghao Xu et.al. | 2307.05462v1 | null |
2023-07-10 | Semantic-SAM: Segment and Recognize Anything at Any Granularity | Feng Li et.al. | 2307.04767v1 | link |
2023-07-10 | Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos | Sagnik Majumder et.al. | 2307.04760v1 | null |
2023-07-10 | Information decomposition to identify relevant variation in complex systems with machine learning | Kieran A. Murphy et.al. | 2307.04755v1 | link |
2023-07-10 | Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement | Anthony Simeonov et.al. | 2307.04751v1 | null |
2023-07-10 | Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback | Jaskirat Singh et.al. | 2307.04749v1 | null |
2023-07-07 | On the Efficacy of Sampling Adapters | Clara Meister et.al. | 2307.03749v1 | link |
2023-07-07 | Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment | Sofia Eleni Spatharioti et.al. | 2307.03744v1 | null |
2023-07-07 | QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models | Tommaso Pegolotti et.al. | 2307.03738v1 | link |
2023-07-06 | Simulating Nelsonian Quantum Field Theory | Andrea Carosso et.al. | 2307.03188v1 | null |
2023-07-06 | Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers | Yuan Gong et.al. | 2307.03183v1 | link |
2023-07-06 | Markov Persuasion Processes with Endogenous Agent Beliefs | Krishnamurthy Iyer et.al. | 2307.03181v1 | null |
2023-07-07 | IPO-LDM: Depth-aided 360-degree Indoor RGB Panorama Outpainting via Latent Diffusion Model | Tianhao Wu et.al. | 2307.03177v2 | null |
2023-07-06 | Push Past Green: Learning to Look Behind Plant Foliage by Moving It | Xiaoyu Zhang et.al. | 2307.03175v1 | null |
2023-07-06 | Risk-Averse Trajectory Optimization via Sample Average Approximation | Thomas Lew et.al. | 2307.03167v1 | link |
2023-07-06 | VideoGLUE: Video General Understanding Evaluation of Foundation Models | Liangzhe Yuan et.al. | 2307.03166v1 | link |
2023-07-05 | LongNet: Scaling Transformers to 1,000,000,000 Tokens | Jiayu Ding et.al. | 2307.02486v1 | link |
2023-07-05 | Elastic Decision Transformer | Yueh-Hua Wu et.al. | 2307.02484v1 | link |
2023-07-05 | Jailbroken: How Does LLM Safety Training Fail? | Alexander Wei et.al. | 2307.02483v1 | null |
2023-07-05 | Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks | Zhaofeng Wu et.al. | 2307.02477v1 | link |
2023-07-05 | The Calissons Puzzle | Jean-Marie Favreau et.al. | 2307.02475v1 | null |
2023-07-06 | Deductive Additivity for Planning of Natural Language Proofs | Zayne Sprague et.al. | 2307.02472v2 | link |
2023-07-05 | What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? | Yan Zeng et.al. | 2307.02469v1 | null |
2023-07-03 | Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning | Yuxiang Zhang et.al. | 2307.01200v1 | null |
2023-07-03 | NeuBTF: Neural fields for BTF encoding and transfer | Carlos Rodriguez-Pardo et.al. | 2307.01199v1 | null |
2023-07-03 | Improved sampling via learned diffusions | Lorenz Richter et.al. | 2307.01198v1 | null |
2023-07-03 | Segment Anything Meets Point Tracking | Frano Rajič et.al. | 2307.01197v1 | link |
2023-07-03 | Squeezing Large-Scale Diffusion Models for Mobile | Jiwoong Choi et.al. | 2307.01193v1 | null |
2023-07-03 | SAMAug: Point Prompt Augmentation for Segment Anything Model | Haixing Dai et.al. | 2307.01187v1 | link |
2023-07-03 | Continuously Red-Shift and Blue-Shift Wavelength-Tuneable, Narrowband, High Harmonics in the EUV - X-ray Regime for Resonance Imaging and Spectroscopies | Dimitar Popmintchev et.al. | 2307.01182v1 | null |
2023-06-30 | Hardwiring ViT Patch Selectivity into CNNs using Patch Mixing | Ariel N. Lee et.al. | 2306.17848v1 | null |
2023-06-30 | Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors | Guocheng Qian et.al. | 2306.17843v1 | link |
2023-07-03 | SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs | Lijun Yu et.al. | 2306.17842v2 | link |
2023-07-03 | Statler: State-Maintaining Language Models for Embodied Reasoning | Takuma Yoneda et.al. | 2306.17840v2 | null |
2023-06-30 | Federated Ensemble YOLOv5 - A Better Generalized Object Detection Algorithm | Vinit Hegiste et.al. | 2306.17829v1 | null |
2023-06-30 | Understanding Unfairness via Training Concept Influence | Yuanshun Yao et.al. | 2306.17828v1 | null |
2023-06-29 | An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training | Zitian Chen et.al. | 2306.17165v1 | null |
2023-06-30 | Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors | Tung Phung et.al. | 2306.17156v2 | null |
2023-06-29 | Generate Anything Anywhere in Any Scene | Yuheng Li et.al. | 2306.17154v1 | null |
2023-06-28 | MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning | Paul Pu Liang et.al. | 2306.16413v1 | link |
2023-06-29 | Even order contributions to relative energies vanish for antisymmetric perturbations | O. Anatole von Lilienfeld et.al. | 2306.16409v2 | null |
2023-06-27 | Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties | Hsiao-Yu Tung et.al. | 2306.15668v1 | null |
2023-06-28 | PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment | Jianyuan Wang et.al. | 2306.15667v2 | null |
2023-06-27 | SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate through Compiler Co-design | Fu-Ming Guo et.al. | 2306.15656v1 | null |
2023-06-27 | Optimal Area-Sensitive Bounds for Polytope Approximation | Sunil Arya et.al. | 2306.15648v1 | null |
2023-06-26 | FunQA: Towards Surprising Video Comprehension | Binzhu Xie et.al. | 2306.14899v1 | link |
2023-06-27 | InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback | John Yang et.al. | 2306.14898v2 | link |
2023-06-26 | Supervised Pretraining Can Learn In-Context Reinforcement Learning | Jonathan N. Lee et.al. | 2306.14892v1 | null |
2023-06-26 | Value of Information in Games with Multiple Strategic Information Providers | Raj Kiriti Velicheti et.al. | 2306.14886v1 | null |
2023-06-26 | Restart Sampling for Improving Generative Processes | Yilun Xu et.al. | 2306.14878v1 | link |
2023-06-26 | Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits | Yuwei Luo et.al. | 2306.14872v1 | null |
2023-06-26 | Composing Parameter-Efficient Modules with Arithmetic Operations | Jinghan Zhang et.al. | 2306.14870v1 | link |
2023-06-23 | GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models | Rishabh Agarwal et.al. | 2306.13649v1 | null |
2023-06-23 | Offline Skill Graph (OSG): A Framework for Learning and Planning using Offline Reinforcement Learning Skills | Ben-ya Halevy et.al. | 2306.13630v1 | null |
2023-06-22 | Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces | Fahad Shamshad et.al. | 2306.13091v1 | link |
2023-06-22 | PromptIR: Prompting for All-in-One Blind Image Restoration | Vaishnav Potlapalli et.al. | 2306.13090v1 | link |
2023-06-22 | Improved Signal Detection for Ambient Backscatter Communications | S. Zargari et.al. | 2306.13083v1 | null |
2023-06-21 | VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution | Siobhan Mackenzie Hall et.al. | 2306.12424v1 | link |
2023-06-21 | Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized Codebase | Qiuyu Wang et.al. | 2306.12423v1 | link |
2023-06-21 | LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models | Shizhe Diao et.al. | 2306.12420v1 | link |
2023-06-21 | Coqlex: Generating Formally Verified Lexers | Wendlasida Ouedraogo et.al. | 2306.12411v1 | null |
2023-06-20 | Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning | Huiguo He et.al. | 2306.11731v1 | null |
2023-06-20 | Dense Video Object Captioning from Disjoint Supervision | Xingyi Zhou et.al. | 2306.11729v1 | link |
2023-06-20 | Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision | Ayush Tewari et.al. | 2306.11719v1 | null |
2023-06-20 | Multi-Fidelity Active Learning with GFlowNets | Alex Hernandez-Garcia et.al. | 2306.11715v1 | link |
2023-06-20 | Data-Driven but Privacy-Conscious: Pedestrian Dataset De-identification via Full-Body Person Synthesis | Maxim Maximov et.al. | 2306.11710v1 | null |
2023-06-16 | Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness | Eric Zelikman et.al. | 2306.10015v1 | link |
2023-06-20 | CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search | Fahad Shamshad et.al. | 2306.10008v2 | link |
2023-06-16 | C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction | Luoyuan Xu et.al. | 2306.10003v1 | null |
2023-06-16 | SLACK: Stable Learning of Augmentations with Cold-start and KL regularization | Juliette Marrie et.al. | 2306.09998v1 | null |
2023-06-16 | Fairness in Preference-based Reinforcement Learning | Umer Siddique et.al. | 2306.09995v1 | null |
2023-06-16 | Rosetta Neurons: Mining the Common Units in a Model Zoo | Amil Dravid et.al. | 2306.09346v2 | null |
2023-06-15 | Evaluating Data Attribution for Text-to-Image Models | Sheng-Yu Wang et.al. | 2306.09345v1 | link |
2023-06-15 | DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data | Stephanie Fu et.al. | 2306.09344v1 | link |
2023-06-15 | Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis | Xiaoshi Wu et.al. | 2306.09341v1 | link |
2023-06-15 | Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking | Björn Bebensee et.al. | 2306.09340v1 | null |
2023-06-15 | From BERT to GPT-3 Codex: Harnessing the Potential of Very Large Language Models for Data Management | Immanuel Trummer et.al. | 2306.09339v1 | null |
2023-06-15 | Generative Proxemics: A Prior for 3D Social Interaction from Images | Lea Müller et.al. | 2306.09337v1 | link |
2023-06-15 | Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Markov Chains | Yilong Qin et.al. | 2306.09332v1 | null |
2023-06-15 | ArtFusion: Arbitrary Style Transfer using Dual Conditional Latent Diffusion Models | Dar-Yen Chen et.al. | 2306.09330v1 | link |
2023-06-13 | XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models | Omkar Thawkar et.al. | 2306.07971v1 | link |
2023-06-13 | GeneCIS: A Benchmark for General Conditional Image Similarity | Sagar Vaze et.al. | 2306.07969v1 | null |
2023-06-13 | One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning | Arnav Chavan et.al. | 2306.07967v1 | link |
2023-06-13 | Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation | Shuai Yang et.al. | 2306.07954v1 | null |
2023-06-12 | Waffling around for Performance: Visual Classification with Random Words and Broad Concepts | Karsten Roth et.al. | 2306.07282v1 | link |
2023-06-12 | Controlling Text-to-Image Diffusion by Orthogonal Finetuning | Zeju Qiu et.al. | 2306.07280v1 | null |
2023-06-12 | Scalable 3D Captioning with Pretrained Models | Tiange Luo et.al. | 2306.07279v1 | link |
2023-06-12 | Mathematical conjecture generation using machine intelligence | Challenger Mishra et.al. | 2306.07277v1 | null |
2023-06-12 | Operator Learning with Neural Fields: Tackling PDEs on General Geometries | Louis Serrano et.al. | 2306.07266v1 | link |
2023-06-12 | On the Collocated Form with Input Decoupling of Lagrangian Systems | Pietro Pustina et.al. | 2306.07258v1 | null |
2023-06-09 | Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding | Mu Cai et.al. | 2306.06094v1 | null |
2023-06-09 | HyP-NeRF: Learning Improved NeRF Priors using a HyperNetwork | Bipasha Sen et.al. | 2306.06093v1 | null |
2023-06-09 | Computational Flash Photography through Intrinsics | Sepideh Sarajian Maralan et.al. | 2306.06089v1 | null |
2023-06-09 | SENS: Sketch-based Implicit Neural Shape Modeling | Alexandre Binninger et.al. | 2306.06088v1 | null |
2023-06-09 | Learning Not to Spoof | David Byrd et.al. | 2306.06087v1 | null |
2023-06-09 | Developing Speech Processing Pipelines for Police Accountability | Anjalie Field et.al. | 2306.06086v1 | null |
2023-06-08 | Background Prompting for Improved Object Depth | Manel Baradad et.al. | 2306.05428v1 | null |
2023-06-08 | Grounded Text-to-Image Synthesis with Attention Refocusing | Quynh Phung et.al. | 2306.05427v1 | null |
2023-06-08 | SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking | Chris Cundy et.al. | 2306.05426v1 | null |
2023-06-08 | MIMIC-IT: Multi-Modal In-Context Instruction Tuning | Bo Li et.al. | 2306.05425v1 | link |
2023-06-08 | Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | Muhammad Maaz et.al. | 2306.05424v1 | link |
2023-06-08 | ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process | Changyao Tian et.al. | 2306.05423v1 | null |
2023-06-08 | Stochastic Multi-Person 3D Motion Forecasting | Sirui Xu et.al. | 2306.05421v1 | link |
2023-06-08 | Scaling Spherical CNNs | Carlos Esteves et.al. | 2306.05420v1 | link |
2023-06-08 | 2D Supervised Monocular 3D Object Detection by Global-to-Local 3D Reconstruction | Jiawei He et.al. | 2306.05418v1 | null |
2023-06-07 | Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection | Yu Bai et.al. | 2306.04637v1 | link |
2023-06-07 | GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image Translation | Shuai Yang et.al. | 2306.04636v1 | link |
2023-06-07 | On the Reliability of Watermarks for Large Language Models | John Kirchenbauer et.al. | 2306.04634v1 | link |
2023-06-07 | Designing a Better Asymmetric VQGAN for StableDiffusion | Zixin Zhu et.al. | 2306.04632v1 | link |
2023-06-07 | Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design | Julien Roy et.al. | 2306.04620v1 | null |
2023-06-07 | Helicity-dependent optical control of the magnetization state emerging from the Landau-Lifshitz-Gilbert equation | Benjamin Assouline et.al. | 2306.04617v1 | null |
2023-06-07 | ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | Chenxu Hu et.al. | 2306.03901v2 | null |
2023-06-06 | Model Spider: Learning to Rank Pre-Trained Models Efficiently | Yi-Kai Zhang et.al. | 2306.03900v1 | null |
2023-06-06 | Towards Label-free Scene Understanding by Vision Foundation Models | Runnan Chen et.al. | 2306.03899v1 | link |
2023-06-05 | Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction | Rose E. Wang et.al. | 2306.03090v1 | link |
2023-06-05 | Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models | Andrew F. Luo et.al. | 2306.03089v1 | null |
2023-06-05 | DeepGraphDMD: Interpretable Spatio-Temporal Decomposition of Non-linear Functional Brain Network Dynamics | Md Asadullah Turja et.al. | 2306.03088v1 | link |
2023-06-05 | MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion | Chiyu Max Jiang et.al. | 2306.03083v1 | null |
2023-06-05 | InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models | Lichang Chen et.al. | 2306.03082v1 | link |
2023-06-05 | Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs | Alexander K. Lew et.al. | 2306.03081v1 | link |
2023-06-05 | A General Perspective on Objectives of Reinforcement Learning | Long Yang et.al. | 2306.03074v1 | null |
2023-06-05 | Explore to Generalize in Zero-Shot RL | Ev Zisselman et.al. | 2306.03072v1 | link |
2023-06-02 | Multilingual Conceptual Coverage in Text-to-Image Models | Michael Saxon et.al. | 2306.01735v1 | link |
2023-06-02 | DocFormerv2: Local Features for Document Understanding | Srikar Appalaraju et.al. | 2306.01733v1 | null |
2023-06-02 | Video Colorization with Pre-trained Text-to-Image Diffusion Models | Hanyuan Liu et.al. | 2306.01732v1 | null |
2023-06-02 | Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans | Stefania Raimondo et.al. | 2306.01729v1 | null |
2023-06-02 | Denoising Diffusion Semantic Segmentation with Mask Prior Modeling | Zeqiang Lai et.al. | 2306.01721v1 | link |
2023-06-02 | Fresh Content Needs More Attention: Multi-funnel Fresh Content Recommendation | Jianling Wang et.al. | 2306.01720v1 | null |
2023-06-02 | Discreteness of asymptotic tensor ranks | Jop Briët et.al. | 2306.01718v1 | null |
2023-06-01 | StyleGAN knows Normal, Depth, Albedo, and More | Anand Bhattad et.al. | 2306.00987v1 | null |
2023-06-02 | Diffusion Self-Guidance for Controllable Image Generation | Dave Epstein et.al. | 2306.00986v2 | null |
2023-06-01 | StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners | Yonglong Tian et.al. | 2306.00984v1 | link |
2023-06-01 | StyleDrop: Text-to-Image Generation in Any Style | Kihyuk Sohn et.al. | 2306.00983v1 | null |
2023-06-01 | SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds | Yanyu Li et.al. | 2306.00980v1 | link |
2023-06-01 | AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Ji Lin et.al. | 2306.00978v1 | link |
2023-06-01 | Intriguing Properties of Text-guided Diffusion Models | Qihao Liu et.al. | 2306.00974v1 | link |
2023-06-01 | Intelligent Grimm – Open-ended Visual Storytelling via Latent Diffusion Models | Chang Liu et.al. | 2306.00973v1 | link |
2023-06-01 | Too Large; Data Reduction for Vision-Language Pre-Training | Alex Jinpeng Wang et.al. | 2305.20087v2 | link |
2023-05-31 | Understanding and Mitigating Copying in Diffusion Models | Gowthami Somepalli et.al. | 2305.20086v1 | link |
2023-05-31 | Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor | Ruizhi Shao et.al. | 2305.20082v1 | null |
2023-05-31 | On the Capacity of Secure $K$-user Product Computation over a Quantum MAC | Yuxiang Lu et.al. | 2305.20073v1 | null |
2023-05-31 | Latent Exploration for Reinforcement Learning | Alberto Silvio Chiappa et.al. | 2305.20065v1 | link |
2023-05-31 | Chatting Makes Perfect – Chat-based Image Retrieval | Matan Levy et.al. | 2305.20062v1 | link |
2023-05-30 | Concise Answers to Complex Questions: Summarization of Long-form Answers | Abhilash Potluri et.al. | 2305.19271v1 | link |
2023-05-30 | Microfluidics Generation of Millimeter-sized Matrigel Droplets | Cory Arnold et.al. | 2305.19261v1 | null |
2023-05-30 | Shuffle SGD is Always Better than SGD: Improved Analysis of SGD with Arbitrary Data Orders | Anastasia Koloskova et.al. | 2305.19259v1 | null |
2023-05-30 | Ambient Diffusion: Learning Clean Distributions from Corrupted Data | Giannis Daras et.al. | 2305.19256v1 | link |
2023-05-30 | What Can We Learn from Unlearnable Datasets? | Pedro Sandoval-Segura et.al. | 2305.19254v1 | link |
2023-05-29 | RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths | Zeyue Xue et.al. | 2305.18295v1 | null |
2023-05-29 | Transformer Language Models Handle Word Frequency in Prediction Head | Goro Kobayashi et.al. | 2305.18294v1 | null |
2023-05-29 | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | Rafael Rafailov et.al. | 2305.18290v1 | link |
2023-05-29 | LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections | M. Jehanzeb Mirza et.al. | 2305.18287v1 | null |
2023-05-29 | Characterization and evasion of backscattered light in the squeezed-light enhanced gravitational wave interferometer GEO 600 | Fabio Bergamin et.al. | 2305.18284v1 | null |
2023-05-29 | Contextual Object Detection with Multimodal Large Language Models | Yuhang Zang et.al. | 2305.18279v1 | link |
2023-05-26 | NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support | Xinyue Wei et.al. | 2305.17134v1 | null |
2023-05-26 | RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation | Gabriele Sarti et.al. | 2305.17131v1 | null |
2023-05-26 | Characterizing and Measuring Linguistic Dataset Drift | Tyler A. Chang et.al. | 2305.17127v1 | link |
2023-05-26 | Large Language Models as Tool Makers | Tianle Cai et.al. | 2305.17126v1 | link |
2023-05-26 | Manifold Regularization for Memory-Efficient Training of Deep Neural Networks | Shadi Sartipi et.al. | 2305.17119v1 | null |
2023-05-26 | Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time | Zichang Liu et.al. | 2305.17118v1 | null |
2023-05-26 | Improving accuracy of GPT-3/4 results on biomedical data using a retrieval-augmented language model | David Soong et.al. | 2305.17116v1 | null |
2023-05-25 | Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models | Shihao Zhao et.al. | 2305.16322v1 | link |
2023-05-25 | Parallel Sampling of Diffusion Models | Andy Shih et.al. | 2305.16317v1 | link |
2023-05-25 | NAP: Neural 3D Articulation Prior | Jiahui Lei et.al. | 2305.16315v1 | null |
2023-05-26 | Banana: Banach Fixed-Point Network for Pointcloud Segmentation with Inter-Part Equivariance | Congyue Deng et.al. | 2305.16314v2 | null |
2023-05-25 | UMat: Uncertainty-Aware Single Image High Resolution Material Capture | Carlos Rodriguez-Pardo et.al. | 2305.16312v1 | null |
2023-05-25 | Break-A-Scene: Extracting Multiple Concepts from a Single Image | Omri Avrahami et.al. | 2305.16311v1 | link |
2023-05-25 | Securing Deep Generative Models with Universal Adversarial Signature | Yu Zeng et.al. | 2305.16310v1 | link |
2023-05-25 | Imitating Task and Motion Planning with Visuomotor Transformers | Murtaza Dalal et.al. | 2305.16309v1 | null |
2023-05-25 | Fine-Grained Complexity Analysis of Multi-Agent Path Finding on 2D Grids | Tzvika Geft et.al. | 2305.16303v1 | null |
2023-05-24 | Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective | Guhao Feng et.al. | 2305.15408v1 | link |
2023-05-24 | Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets | Brandon Smith et.al. | 2305.15407v1 | link |
2023-05-24 | Sin3DM: Learning a Diffusion Model from a Single 3D Textured Shape | Rundi Wu et.al. | 2305.15399v1 | link |
2023-05-24 | LayoutGPT: Compositional Visual Planning and Generation with Large Language Models | Weixi Feng et.al. | 2305.15393v1 | link |
2023-05-24 | A Neural Space-Time Representation for Text-to-Image Personalization | Yuval Alaluf et.al. | 2305.15391v1 | link |
2023-05-24 | Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering | Avi Caciularu et.al. | 2305.15387v1 | link |
2023-05-23 | NCHO: Unsupervised Learning for Neural 3D Composition of Humans and Objects | Taeksoo Kim et.al. | 2305.14345v1 | link |
2023-05-23 | Video Prediction Models as Rewards for Reinforcement Learning | Alejandro Escontrela et.al. | 2305.14343v1 | null |
2023-05-23 | APPLS: A Meta-evaluation Testbed for Plain Language Summarization | Yue Guo et.al. | 2305.14341v1 | link |
2023-05-23 | Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence | Grace Luo et.al. | 2305.14334v1 | null |
2023-05-23 | Evaluating and Modeling Attribution for Cross-Lingual Question Answering | Benjamin Muller et.al. | 2305.14332v1 | null |
2023-05-23 | Large Language Models are Frame-level Directors for Zero-shot Text-to-Video Generation | Susung Hong et.al. | 2305.14330v1 | link |
2023-05-23 | Zero-sum Polymatrix Markov Games: Equilibrium Collapse and Efficient Computation of Nash Equilibria | Fivos Kalogiannis et.al. | 2305.14329v1 | null |
2023-05-23 | Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation | Da Yin et.al. | 2305.14327v1 | link |
2023-05-22 | Contextualising Implicit Representations for Semantic Tasks | Theo W. Costain et.al. | 2305.13312v1 | null |
2023-05-22 | VDT: An Empirical Study on Video Diffusion with Transformers | Haoyu Lu et.al. | 2305.13311v1 | link |
2023-05-22 | Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching | Yang Liu et.al. | 2305.13310v1 | link |
2023-05-22 | Evaluating Factual Consistency of Texts with Semantic Role Labeling | Jing Fan et.al. | 2305.13309v1 | link |
2023-05-22 | If at First You Don’t Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection | Shyamgopal Karthik et.al. | 2305.13308v1 | link |
2023-05-22 | NeRFuser: Large-Scale Scene Representation by NeRF Fusion | Jiading Fang et.al. | 2305.13307v1 | link |
2023-05-22 | Growth of ultrawide-bandgap BN/diamond heterostructures by pulsed laser deposition | Abhijit Biswas et.al. | 2305.13306v1 | null |
2023-05-22 | RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | Wangchunshu Zhou et.al. | 2305.13304v1 | link |
2023-05-23 | Training Diffusion Models with Reinforcement Learning | Kevin Black et.al. | 2305.13301v2 | link |
2023-05-22 | Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations | Chenglei Si et.al. | 2305.13299v1 | link |
2023-05-19 | Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models | Byungjun Kim et.al. | 2305.11870v1 | link |
2023-05-19 | Reducing Sequence Length by Predicting Edit Operations with Large Language Models | Masahiro Kaneko et.al. | 2305.11862v1 | null |
2023-05-19 | Video Killed the HD-Map: Predicting Driving Behavior Directly From Drone Images | Yunpeng Liu et.al. | 2305.11856v1 | null |
2023-05-19 | Multimodal Web Navigation with Instruction-Finetuned Foundation Models | Hiroki Furuta et.al. | 2305.11854v1 | null |
2023-05-19 | Poincare and Einstein on Mass-Energy Equivalence: A Modern Perspective on their 1900 and 1905 Papers | Patrick Moylan et.al. | 2305.11852v1 | null |
2023-05-19 | Any-to-Any Generation via Composable Diffusion | Zineng Tang et.al. | 2305.11846v1 | link |
2023-05-18 | Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | Siyuan Huang et.al. | 2305.11176v1 | link |
2023-05-18 | VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks | Wenhai Wang et.al. | 2305.11175v1 | link |
2023-05-18 | Going Denser with Open-Vocabulary Part Segmentation | Peize Sun et.al. | 2305.11173v1 | link |
2023-05-18 | ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | Peng Wang et.al. | 2305.11172v1 | link |
2023-05-18 | TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models | Zorik Gekhman et.al. | 2305.11171v1 | link |
2023-05-18 | Efficient Prompting via Dynamic In-Context Learning | Wangchunshu Zhou et.al. | 2305.11170v1 | null |
2023-05-18 | Evidence of Meaning in Language Models Trained on Programs | Charles Jin et.al. | 2305.11169v1 | null |
2023-05-17 | FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention | Guangxuan Xiao et.al. | 2305.10431v1 | link |
2023-05-17 | CLIP-GCD: Simple Language Guided Generalized Category Discovery | Rabah Ouldnoughi et.al. | 2305.10420v1 | null |
2023-05-17 | Towards Multi-Layered 3D Garments Animation | Yidi Shao et.al. | 2305.10418v1 | null |
2023-05-17 | Scratch Copilot Evaluation: Assessing AI-Assisted Creative Coding for Families | Stefania Druga et.al. | 2305.10417v1 | null |
2023-05-18 | PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering | Xiaoman Zhang et.al. | 2305.10415v2 | link |
2023-05-17 | AI Friends: A Design Framework for AI-Powered Creative Programming for Youth | Stefania Druga et.al. | 2305.10412v1 | null |
2023-05-17 | Data Extraction via Semantic Regular Expression Synthesis | Qiaochu Chen et.al. | 2305.10401v1 | null |
2023-05-16 | Understanding 3D Object Interaction from a Single Image | Shengyi Qian et.al. | 2305.09664v1 | link |
2023-05-16 | Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation | Samaneh Azadi et.al. | 2305.09662v1 | null |
2023-05-16 | Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage | Jose Blanchet et.al. | 2305.09659v1 | null |
2023-05-16 | Newad: A register map automation tool for Verilog | Vamsi K Vytla et.al. | 2305.09657v1 | null |
2023-05-17 | Satisfiability-Aided Language Models Using Declarative Prompting | Xi Ye et.al. | 2305.09656v2 | link |
2023-05-16 | Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation | Yuxin Ren et.al. | 2305.09651v1 | link |
2023-05-16 | Wavelet-based Unsupervised Label-to-Image Translation | George Eskandar et.al. | 2305.09647v1 | link |
2023-05-15 | Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models | Antoni Bigata Casademunt et.al. | 2305.08854v1 | link |
2023-05-15 | CQE: A Comprehensive Quantity Extractor | Satya Almasian et.al. | 2305.08853v1 | link |
2023-05-15 | MV-Map: Offboard HD-Map Generation with Multi-view Consistency | Ziyang Xie et.al. | 2305.08851v1 | link |
2023-05-15 | Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts | Yuyang Zhao et.al. | 2305.08850v1 | null |
2023-05-15 | Privacy Auditing with One (1) Training Run | Thomas Steinke et.al. | 2305.08846v1 | null |
2023-05-15 | Large Language Models are Zero-Shot Rankers for Recommender Systems | Yupeng Hou et.al. | 2305.08845v1 | link |
2023-05-15 | RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs | Afra Feyza Akyürek et.al. | 2305.08844v1 | link |
2023-05-15 | Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks | Minyoung Huh et.al. | 2305.08842v1 | null |
2023-05-15 | Attacking Perceptual Similarity Metrics | Abhijay Ghildyal et.al. | 2305.08840v1 | null |
2023-05-12 | Text2Cohort: Democratizing the NCI Imaging Data Commons with Natural Language Cohort Discovery | Pranav Kulkarni et.al. | 2305.07637v1 | link |
2023-05-12 | Development of MC/DC: a performant, scalable, and portable Python-based Monte Carlo neutron transport code | Ilham Variansyah et.al. | 2305.07636v1 | link |
2023-05-12 | Zero-shot Item-based Recommendation via Multi-task Product Knowledge Graph Pre-Training | Ziwei Fan et.al. | 2305.07633v1 | null |
2023-05-12 | Design, Development, and Evaluation of an Interactive Personalized Social Robot to Monitor and Coach Post-Stroke Rehabilitation Exercises | Min Hun Lee et.al. | 2305.07632v1 | null |
2023-05-11 | SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views | Weihao Cheng et.al. | 2305.07024v1 | link |
2023-05-11 | Simple Token-Level Confidence Improves Caption Correctness | Suzanne Petryk et.al. | 2305.07021v1 | null |
2023-05-11 | A General-Purpose Multilingual Document Encoder | Onur Galoğlu et.al. | 2305.07016v1 | link |
2023-05-11 | Exploiting Diffusion Prior for Real-World Image Super-Resolution | Jianyi Wang et.al. | 2305.07015v1 | link |
2023-05-11 | Occam’s razor for AI: Coarse-graining Hammett Inspired Product Ansatz in Chemical Space | Marco Bragato et.al. | 2305.07010v1 | null |
2023-05-11 | Fair Price Discrimination | Siddhartha Banerjee et.al. | 2305.07006v1 | null |
2023-05-11 | Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation | Francois Meyer et.al. | 2305.07005v1 | link |
2023-05-11 | Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting | Haoyang Huang et.al. | 2305.07004v1 | null |
2023-05-11 | Real-time Manipulation of Liquid Droplets using Photo-responsive Surfactant | Xichen Liang et.al. | 2305.07002v1 | null |
2023-05-10 | Generalizations and Extensions to Lifting Constructions for Coded Caching | V. R. Aravind et.al. | 2305.06352v1 | null |
2023-05-10 | RECKONING: Reasoning through Dynamic Knowledge Encoding | Zeming Chen et.al. | 2305.06349v1 | link |
2023-05-10 | Frequency-Supported Neural Networks for Nonlinear Dynamical System Identification | Krzysztof Zając et.al. | 2305.06344v1 | link |
2023-05-10 | Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs | Roei Herzig et.al. | 2305.06343v1 | null |
2023-05-10 | Generalized Stratified Sampling for Efficient Reliability Assessment of Structures Against Natural Hazards | Srinivasan Arunachalam et.al. | 2305.06338v1 | null |
2023-05-10 | K-UniMorph: Korean Universal Morphology and its Feature Schema | Eunkyul Leah Jo et.al. | 2305.06335v1 | link |
2023-05-10 | Direct-Laser-Written Polymer Nanowire Waveguides for Broadband Single Photon Collection from Epitaxial Quantum Dots into a Gaussian-like Mode | Edgar Perez et.al. | 2305.06333v1 | null |
2023-05-09 | Policy Gradient Methods in the Presence of Symmetries and State Abstractions | Prakash Panangaden et.al. | 2305.05666v1 | link |
2023-05-09 | ImageBind: One Embedding Space To Bind Them All | Rohit Girdhar et.al. | 2305.05665v1 | link |
2023-05-10 | InternChat: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language | Zhaoyang Liu et.al. | 2305.05662v2 | link |
2023-05-09 | TidyBot: Personalized Robot Assistance with Large Language Models | Jimmy Wu et.al. | 2305.05658v1 | link |
2023-05-09 | Using Knowledge Units of Programming Languages to Recommend Reviewers for Pull Requests: An Empirical Study | Md Ahasanuzzaman et.al. | 2305.05654v1 | null |
2023-05-09 | Asymmetric $X$-Secure $T$-Private Information Retrieval: More Databases is Not Always Better | Mohamed Nomeir et.al. | 2305.05649v1 | null |
2023-05-08 | Learning to Evaluate the Artness of AI-generated Images | Junyu Chen et.al. | 2305.04923v1 | null |
2023-05-08 | DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models | Sicheng Yang et.al. | 2305.04919v1 | link |
2023-05-08 | What Do Patients Say About Their Disease Symptoms? Deep Multilabel Text Classification With Human-in-the-Loop Curation for Automatic Labeling of Patient Self Reports of Problems | Lakshmi Arbatti et.al. | 2305.04905v1 | null |
2023-05-08 | Robust Positivity Problems for low-order Linear Recurrence Sequences | Mihir Vahanwala et.al. | 2305.04870v1 | null |
2023-05-05 | On the Benefits of Semi-Supervised Test Case Generation for Cyber-Physical Systems | Xiao Ling et.al. | 2305.03714v1 | null |
2023-05-05 | Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos | Ekta Prashnani et.al. | 2305.03713v1 | null |
2023-05-08 | On the characterization of the convective heat flux in turbulent Rayleigh-Bénard convection | Bérengère Podvin et.al. | 2305.03708v2 | null |
2023-05-05 | LMEye: An Interactive Perception Network for Large Language Models | Yunxin Li et.al. | 2305.03701v1 | link |
2023-05-05 | Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements | Jiacheng Liu et.al. | 2305.03695v1 | link |
2023-05-05 | Mining bias-target Alignment from Voronoi Cells | Rémi Nahon et.al. | 2305.03691v1 | link |
2023-05-05 | COLA: How to adapt vision-language models to Compose Objects Localized with Attributes? | Arijit Ray et.al. | 2305.03689v1 | link |
2023-05-04 | ZipIt! Merging Models from Different Tasks without Training | George Stoica et.al. | 2305.03053v1 | link |
2023-05-04 | Controllable Visual-Tactile Synthesis | Ruihan Gao et.al. | 2305.03051v1 | link |
2023-05-04 | NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds | Jun-Kun Chen et.al. | 2305.03049v1 | null |
2023-05-04 | Personalize Segment Anything Model with One Shot | Renrui Zhang et.al. | 2305.03048v1 | link |
2023-05-04 | Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision | Zhiqing Sun et.al. | 2305.03047v1 | link |
2023-05-04 | OctFormer: Octree-based Transformers for 3D Point Clouds | Peng-Shuai Wang et.al. | 2305.03045v1 | link |
2023-05-04 | Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization | Connor Z. Lin et.al. | 2305.03043v1 | null |
2023-05-04 | Are VAEs Bad at Reconstructing Molecular Graphs? | Hagen Muenkler et.al. | 2305.03041v1 | null |
2023-05-04 | TUVF: Learning Generalizable Texture UV Radiance Fields | An-Chieh Cheng et.al. | 2305.03040v1 | null |
2023-05-03 | Characterizing Political Bias in Automatic Summaries: A Case Study of Trump and Biden | Karen Zhou et.al. | 2305.02321v1 | link |
2023-05-03 | Generating Synthetic Documents for Cross-Encoder Re-Rankers: A Comparative Study of ChatGPT and Human Experts | Arian Askari et.al. | 2305.02320v1 | link |
2023-05-03 | Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings | Daniel Rose et.al. | 2305.02317v1 | null |
2023-05-03 | AG3D: Learning to Generate 3D Avatars from 2D Image Collections | Zijian Dong et.al. | 2305.02312v1 | null |
2023-05-03 | Real-Time Radiance Fields for Single-Image Portrait View Synthesis | Alex Trevithick et.al. | 2305.02310v1 | null |
2023-05-03 | Calibrated Explanations: with Uncertainty Information and Counterfactuals | Helena Lofstrom et.al. | 2305.02305v1 | link |
2023-05-02 | Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection | Ruoshi Liu et.al. | 2305.01652v1 | null |
2023-05-02 | Generalizing Dataset Distillation via Deep Generative Prior | George Cazenavette et.al. | 2305.01649v1 | link |
2023-05-02 | Sequence Modeling with Multiresolution Convolutional Memory | Jiaxin Shi et.al. | 2305.01638v1 | link |
2023-05-02 | The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers | Ariel Gera et.al. | 2305.01628v1 | link |
2023-05-02 | Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks | Gašper Beguš et.al. | 2305.01626v1 | null |
2023-05-02 | TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis | Mathis Petrovich et.al. | 2305.00976v1 | null |
2023-05-01 | ArK: Augmented Reality with Knowledge Interactive Emergent Ability | Qiuyuan Huang et.al. | 2305.00970v1 | null |
2023-05-01 | PMDG: Privacy for Multi-Perspective Process Mining through Data Generalization | Ryan Hildebrant et.al. | 2305.00960v1 | null |
2023-05-01 | Non-Binary LDPC Code Design for Energy-Time Entanglement Quantum Key Distribution | Debarnab Mitra et.al. | 2305.00956v1 | null |
2023-05-01 | Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation | Patrick Fernandes et.al. | 2305.00955v1 | null |
2023-04-28 | LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Peng Gao et.al. | 2304.15010v1 | link |
2023-04-28 | Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs | George Pu et.al. | 2304.14999v1 | null |
2023-04-28 | ChatGPT – a Blessing or a Curse for Undergraduate Computer Science Students and Instructors? | Ishika Joshi et.al. | 2304.14993v1 | null |
2023-04-28 | Robust Stackelberg Equilibria | Jiarui Gan et.al. | 2304.14990v1 | null |
2023-04-28 | Interpreting Vision and Language Generative Models with Semantic Visual Priors | Michele Cafagna et.al. | 2304.14986v1 | null |
2023-04-28 | Optimal majority rules and quantitative Condorcet properties of setwise Kemeny voting schemes | Xuan Kien Phung et.al. | 2304.14980v1 | null |
2023-04-28 | MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks | Lei Zhang et.al. | 2304.14979v1 | link |
2023-04-27 | ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System | Junke Wang et.al. | 2304.14407v1 | null |
2023-04-27 | Motion-Conditioned Diffusion Model for Controllable Video Synthesis | Tsai-Shien Chen et.al. | 2304.14404v1 | null |
2023-04-27 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | Minghao Wu et.al. | 2304.14402v1 | link |
2023-04-27 | ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs | Jiteng Mu et.al. | 2304.14401v1 | null |
2023-04-27 | IconShop: Text-Based Vector Icon Synthesis with Autoregressive Transformers | Ronghuan Wu et.al. | 2304.14400v1 | null |
2023-04-27 | We’re Afraid Language Models Aren’t Modeling Ambiguity | Alisa Liu et.al. | 2304.14399v1 | link |
2023-04-27 | Maximizing Model Generalization for Manufacturing with Self-Supervised Learning and Federated Learning | Matthew Russell et.al. | 2304.14398v1 | null |
2023-04-27 | Learning Articulated Shape with Keypoint Pseudo-labels from Web Images | Anastasis Stathopoulos et.al. | 2304.14396v1 | null |
2023-04-27 | SeqTrack: Sequence to Sequence Learning for Visual Object Tracking | Xin Chen et.al. | 2304.14394v1 | link |
2023-04-26 | Controllable Image Generation via Collage Representations | Arantxa Casanova et.al. | 2304.13722v1 | null |
2023-04-26 | Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery | Debadutta Dash et.al. | 2304.13714v1 | null |
2023-04-27 | Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond | Jingfeng Yang et.al. | 2304.13712v2 | link |
2023-04-26 | UniNeXt: Exploring A Unified Architecture for Vision Recognition | Fangjian Lin et.al. | 2304.13700v1 | link |
2023-04-26 | Hitting Subgraphs in Sparse Graphs and Geometric Intersection Graphs | Daniel Lokshtanov et.al. | 2304.13695v1 | null |
2023-04-26 | HeySQuAD: A Spoken Question Answering Dataset | Yijing Wu et.al. | 2304.13689v1 | link |
2023-04-25 | DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection | Huan-ang Gao et.al. | 2304.13031v1 | link |
2023-04-25 | On the mechanism of polaritonic rate suppression from quantum transition paths | Michelle C. Anderson et.al. | 2304.13024v1 | null |
2023-04-25 | Seeing is not always believing: A Quantitative Study on Human Perception of AI-Generated Images | Zeyu Lu et.al. | 2304.13023v1 | link |
2023-04-25 | Certifying Ensembles: A General Certification Theory with S-Lipschitzness | Aleksandar Petrov et.al. | 2304.13019v1 | null |
2023-04-25 | Bibliometric Data Fusion for Biomedical Information Retrieval | Timo Breuer et.al. | 2304.13012v1 | null |
2023-04-25 | The Potential of Visual ChatGPT For Remote Sensing | Lucas Prado Osco et.al. | 2304.13009v1 | null |
2023-04-25 | Answering Questions by Meta-Reasoning over Multiple Chains of Thought | Ori Yoran et.al. | 2304.13007v1 | link |
2023-04-24 | Explicit Correspondence Matching for Generalizable Neural Radiance Fields | Yuedong Chen et.al. | 2304.12294v1 | link |
2023-04-24 | Synthpop++: A Hybrid Framework for Generating A Country-scale Synthetic Population | Bhavesh Neekhra et.al. | 2304.12284v1 | link |
2023-04-21 | Deep-Learning-based Fast and Accurate 3D CT Deformable Image Registration in Lung Cancer | Yuzhen Ding et.al. | 2304.11135v1 | null |
2023-04-20 | Learning Sparse and Low-Rank Priors for Image Recovery via Iterative Reweighted Least Squares Minimization | Stamatios Lefkimmiatis et.al. | 2304.10536v1 | null |
2023-04-20 | Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion | Tomas Jakab et.al. | 2304.10535v1 | null |
2023-04-20 | Collaborative Diffusion for Multi-Modal Face Generation and Editing | Ziqi Huang et.al. | 2304.10530v1 | link |
2023-04-20 | Generalizing Neural Human Fitting to Unseen Poses With Articulated SE(3) Equivariance | Haiwen Feng et.al. | 2304.10528v1 | null |
2023-04-20 | Multidimensional Uncertainty Quantification for Deep Neural Networks | Xujiang Zhao et.al. | 2304.10527v1 | null |
2023-04-20 | GenCorres: Consistent Shape Matching via Coupled Implicit-Explicit Shape Generative Models | Haitao Yang et.al. | 2304.10523v1 | link |
2023-04-20 | Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget | Johannes Lehner et.al. | 2304.10520v1 | link |
2023-04-19 | LipsFormer: Introducing Lipschitz Continuity to Vision Transformers | Xianbiao Qi et.al. | 2304.09856v1 | link |
2023-04-19 | Bridging RL Theory and Practice with the Effective Horizon | Cassidy Laidlaw et.al. | 2304.09853v1 | link |
2023-04-19 | Evaluating Verifiability in Generative Search Engines | Nelson F. Liu et.al. | 2304.09848v1 | link |
2023-04-19 | Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | Pan Lu et.al. | 2304.09842v1 | link |
2023-04-19 | Points of non-linearity of functions generated by random neural networks | David Holmes et.al. | 2304.09837v1 | null |
2023-04-18 | Optimal PAC Bounds Without Uniform Convergence | Ishaq Aden-Ali et.al. | 2304.09167v1 | null |
2023-04-18 | Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task | Zihao Wu et.al. | 2304.09138v1 | null |
2023-04-17 | Conditional Generation of Audio from Video via Foley Analogies | Yuexi Du et.al. | 2304.08490v1 | link |
2023-04-17 | Hyper-Decision Transformer for Efficient Online Policy Adaptation | Mengdi Xu et.al. | 2304.08487v1 | null |
2023-04-17 | Visual Instruction Tuning | Haotian Liu et.al. | 2304.08485v1 | link |
2023-04-17 | Text2Performer: Text-Driven Human Video Generation | Yuming Jiang et.al. | 2304.08483v1 | link |
2023-04-17 | Towards Robust Prompts on Vision-Language Models | Jindong Gu et.al. | 2304.08479v1 | null |
2023-04-18 | Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation | Jie An et.al. | 2304.08477v2 | null |
2023-04-14 | Cross-Entropy Loss Functions: Theoretical Analysis and Applications | Anqi Mao et.al. | 2304.07288v1 | null |
2023-04-14 | Solving Unique Games over Globally Hypercontractive Graphs | Mitali Bafna et.al. | 2304.07284v1 | null |
2023-04-14 | Synthetically Generating Human-like Data for Sequential Decision Making Tasks via Reward-Shaped Imitation Learning | Bryan Brandt et.al. | 2304.07280v1 | null |
2023-04-17 | Identifying Cluttering Edges in Near-Planar Graphs | Simon van Wageningen et.al. | 2304.07274v2 | link |
2023-04-13 | Expressive Text-to-Image Generation with Rich Text | Songwei Ge et.al. | 2304.06720v1 | null |
2023-04-13 | Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction | Hansheng Chen et.al. | 2304.06714v1 | link |
2023-04-13 | What does CLIP know about a red circle? Visual prompt engineering for VLMs | Aleksandar Shtedritski et.al. | 2304.06712v1 | null |
2023-04-13 | DiffusionRig: Learning Personalized Priors for Facial Appearance Editing | Zheng Ding et.al. | 2304.06711v1 | link |
2023-04-13 | How Will It Drape Like? Capturing Fabric Mechanics from Depth Images | Carlos Rodriguez-Pardo et.al. | 2304.06704v1 | null |
2023-04-13 | Learning Controllable 3D Diffusion Models from Single-view Images | Jiatao Gu et.al. | 2304.06700v1 | null |
2023-04-13 | Improving novelty detection with generative adversarial networks on hand gesture data | Miguel Simão et.al. | 2304.06696v1 | null |
2023-04-12 | Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA | James Seale Smith et.al. | 2304.06027v1 | null |
2023-04-12 | DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion | Johanna Karras et.al. | 2304.06025v1 | null |
2023-04-12 | Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views | Siwei Zhang et.al. | 2304.06024v1 | link |
2023-04-12 | SAM Struggles in Concealed Scenes – Empirical Study on “Segment Anything” | Ge-Peng Ji et.al. | 2304.06022v1 | null |
2023-04-12 | Crowd Counting with Sparse Annotation | Shiwei Zhang et.al. | 2304.06021v1 | null |
2023-04-12 | VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs | Moayed Haji Ali et.al. | 2304.06020v1 | null |
2023-04-12 | Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera | Ruicheng Feng et.al. | 2304.06019v1 | link |
2023-04-12 | Bi-level Latent Variable Model for Sample-Efficient Multi-Agent Reinforcement Learning | Aravind Venugopal et.al. | 2304.06011v1 | null |
2023-04-11 | HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models | Eslam Mohamed Bakr et.al. | 2304.05390v1 | link |
2023-04-11 | Human-AI Co-Creation Approach to Find Forever Chemicals Replacements | Juliana Jansen Ferreira et.al. | 2304.05389v1 | null |
2023-04-11 | MOST: Multiple Object localization with Self-supervised Transformers for object discovery | Sai Saketh Rambhatla et.al. | 2304.05387v1 | null |
2023-04-11 | Bloom filters for molecules | Jorge Medina et.al. | 2304.05386v1 | link |
2023-04-10 | A Cheaper and Better Diffusion Language Model with Soft-Masked Noise | Jiaao Chen et.al. | 2304.04746v1 | link |
2023-04-10 | Ambiguous Medical Image Segmentation using Diffusion Models | Aimon Rahman et.al. | 2304.04745v1 | link |
2023-04-10 | On the Possibilities of AI-Generated Text Detection | Souradip Chakraborty et.al. | 2304.04736v1 | null |
2023-04-07 | Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following | Mingyu Ding et.al. | 2304.03767v1 | null |
2023-04-07 | Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering | Hung-Ting Su et.al. | 2304.03754v1 | null |
2023-04-07 | V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang et.al. | 2304.03752v1 | null |
2023-04-07 | Perspectives on AI Architectures and Co-design for Earth System Predictability | Maruti K. Mudunuru et.al. | 2304.03748v1 | null |
2023-04-07 | Assessing Perceived Fairness from Machine Learning Developer’s Perspective | Anoop Mishra et.al. | 2304.03745v1 | null |
2023-04-06 | Diffusion Models as Masked Autoencoders | Chen Wei et.al. | 2304.03283v1 | null |
2023-04-06 | Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark | Alexander Pan et.al. | 2304.03279v1 | link |
2023-04-06 | How Do US Congress Members Advertise Climate Change: An Analysis Of Ads Run On Meta’s Platforms | Laurenz Aisenpreis et.al. | 2304.03278v1 | null |
2023-04-06 | Instruction Tuning with GPT-4 | Baolin Peng et.al. | 2304.03277v1 | link |
2023-04-06 | That’s What I Said: Fully-Controllable Talking Face Generation | Youngjoon Jang et.al. | 2304.03275v1 | null |
2023-04-06 | Towards self-driving laboratories in chemistry and materials sciences: The central role of DFT in the era of AI | Bing Huang et.al. | 2304.03272v1 | null |
2023-04-06 | Causal Discovery with Score Matching on Additive Models with Arbitrary Noise | Francesco Montagna et.al. | 2304.03265v1 | null |
2023-04-05 | Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models | Xuhui Jia et.al. | 2304.02642v1 | null |
2023-04-05 | ENTL: Embodied Navigation Trajectory Learner | Klemen Kotar et.al. | 2304.02639v1 | null |
2023-04-05 | GenPhys: From Physical Processes to Generative Models | Ziming Liu et.al. | 2304.02637v1 | null |
2023-04-05 | HNeRV: A Hybrid Neural Representation for Videos | Hao Chen et.al. | 2304.02633v1 | link |
2023-04-05 | Towards Explainable AI Writing Assistants for Non-native English Speakers | Yewon Kim et.al. | 2304.02625v1 | null |
2023-04-05 | High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation | Arvi Jonnarth et.al. | 2304.02621v1 | link |
2023-04-04 | Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT | Yinlin Deng et.al. | 2304.02014v1 | null |
2023-04-04 | NPC: Neural Point Characters from Video | Shih-Yang Su et.al. | 2304.02013v1 | null |
2023-04-04 | EGC: Image Generation and Classification via a Single Energy-Based Model | Qiushan Guo et.al. | 2304.02012v1 | link |
2023-04-04 | FakET: Simulating Cryo-Electron Tomograms with Neural Style Transfer | Pavol Harar et.al. | 2304.02011v1 | link |
2023-04-04 | OrienterNet: Visual Localization in 2D Public Maps with Neural Matching | Paul-Edouard Sarlin et.al. | 2304.02009v1 | null |
2023-04-04 | MonoHuman: Animatable Human Neural Field from Monocular Video | Zhengming Yu et.al. | 2304.02001v1 | null |
2023-04-04 | Revisiting the Evaluation of Image Synthesis with GANs | Mengping Yang et.al. | 2304.01999v1 | link |
2023-04-03 | Video Instance Segmentation in an Open-World | Omkar Thawakar et.al. | 2304.01200v1 | link |
2023-04-03 | Zero-Shot Semantic Segmentation with Decoupled One-Pass Network | Cong Han et.al. | 2304.01198v1 | link |
2023-04-03 | Bringing Telepresence to Every Desk | Shengze Wang et.al. | 2304.01197v1 | null |
2023-04-04 | Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | Canwen Xu et.al. | 2304.01196v2 | link |
2023-04-03 | Burstormer: Burst Image Restoration and Enhancement Transformer | Akshay Dudhane et.al. | 2304.01194v1 | link |
2023-04-03 | Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos | Yue Ma et.al. | 2304.01186v1 | link |
2023-04-03 | Whistler Wave Observations by Parker Solar Probe During Encounter 1: Counter-Propagating Whistlers Collocated with Magnetic Field Inhomogeneities and their Application to Electric Field Measurement Calibration | S. Karbashewski et.al. | 2304.01185v1 | null |
2023-03-31 | Towards Flexible Multi-modal Document Models | Naoto Inoue et.al. | 2303.18248v1 | link |
2023-03-31 | Speeding up Madgraph5_aMC@NLO through CPU vectorization and GPU offloading: towards a first alpha release | Andrea Valassi et.al. | 2303.18244v1 | null |
2023-03-31 | $\infty$-Diff: Infinite Resolution Diffusion with Subsampled Mollified States | Sam Bond-Taylor et.al. | 2303.18242v1 | link |
2023-03-31 | Procedure-Aware Pretraining for Instructional Video Understanding | Honglu Zhou et.al. | 2303.18230v1 | link |
2023-03-31 | A Survey of Large Language Models | Wayne Xin Zhao et.al. | 2303.18223v1 | link |
2023-03-31 | SemHint-MD: Learning from Noisy Semantic Labels for Self-Supervised Monocular Depth Estimation | Shan Lin et.al. | 2303.18219v1 | null |
2023-03-31 | A Closer Look at Few-Shot 3D Point Cloud Classification | Chuangguan Ye et.al. | 2303.18210v1 | link |
2023-03-30 | AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control | Ruixiang Jiang et.al. | 2303.17606v1 | link |
2023-03-30 | Token Merging for Fast Stable Diffusion | Daniel Bolya et.al. | 2303.17604v1 | link |
2023-03-30 | NeRF-Supervised Deep Stereo | Fabio Tosi et.al. | 2303.17603v1 | link |
2023-03-30 | Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Weihua Chen et.al. | 2303.17602v1 | link |
2023-03-30 | When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning | Zichen Zhang et.al. | 2303.17600v1 | null |
2023-03-30 | Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models | Wen Wang et.al. | 2303.17599v1 | link |
2023-03-30 | Consistent View Synthesis with Pose-Guided Diffusion Models | Hung-Yu Tseng et.al. | 2303.17598v1 | null |
2023-03-30 | MobileInst: Video Instance Segmentation on the Mobile | Renhong Zhang et.al. | 2303.17594v1 | null |
2023-03-29 | AutoAD: Movie Description in Context | Tengda Han et.al. | 2303.16899v1 | link |
2023-03-29 | Bagging by Learning to Singulate Layers Using Interactive Perception | Lawrence Yunliang Chen et.al. | 2303.16898v1 | null |
2023-03-29 | Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos | Kun Su et.al. | 2303.16897v1 | null |
2023-03-29 | Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation | Md Mostafijur Rahman et.al. | 2303.16892v1 | link |
2023-03-29 | Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations | Vibashan VS et.al. | 2303.16891v1 | null |
2023-03-29 | DPF: Learning Dense Prediction Fields with Weak Supervision | Xiaoxue Chen et.al. | 2303.16890v1 | link |
2023-03-29 | Towards Understanding the Effect of Pretraining Label Granularity | Guan Zhe Hong et.al. | 2303.16887v1 | null |
2023-03-29 | End-to-End $n$-ary Relation Extraction for Combination Drug Therapies | Yuhang Jiang et.al. | 2303.16886v1 | link |
2023-03-29 | Instant Neural Radiance Fields Stylization | Shaoxu Li et.al. | 2303.16884v1 | link |
2023-03-29 | Your Diffusion Model is Secretly a Zero-Shot Classifier | Alexander C. Li et.al. | 2303.16203v2 | link |
2023-03-28 | LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention | Renrui Zhang et.al. | 2303.16199v1 | link |
2023-03-28 | BC-IRL: Learning Generalizable Reward Functions from Demonstrations | Andrew Szot et.al. | 2303.16194v1 | null |
2023-03-28 | Planning with Sequence Models through Iterative Energy Minimization | Hongyi Chen et.al. | 2303.16189v1 | null |
2023-03-28 | Visual Chain-of-Thought Diffusion Models | William Harvey et.al. | 2303.16187v1 | link |
2023-03-28 | Label Smoothing Improves Neural Source Code Summarization | Sakib Haque et.al. | 2303.16178v1 | null |
2023-03-27 | IRFL: Image Recognition of Figurative Language | Ron Yosef et.al. | 2303.15445v1 | link |
2023-03-27 | Zero-shot Model Diagnosis | Jinqi Luo et.al. | 2303.15441v1 | null |
2023-03-27 | FaceLit: Neural 3D Relightable Faces | Anurag Ranjan et.al. | 2303.15437v1 | null |
2023-03-27 | The Stable Signature: Rooting Watermarks in Latent Diffusion Models | Pierre Fernandez et.al. | 2303.15435v1 | link |
2023-03-27 | Anti-DreamBooth: Protecting users from personalized text-to-image synthesis | Thanh Van Le et.al. | 2303.15433v1 | link |
2023-03-27 | TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models | Md Kamrul Hasan et.al. | 2303.15430v1 | null |
2023-03-27 | JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields | Xi Wang et.al. | 2303.15427v1 | link |
2023-03-24 | Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning | Xiaoyang Wu et.al. | 2303.14191v1 | link |
2023-03-24 | Learning from Few Demonstrations with Frame-Weighted Motion Generation | Jianyong Sun et.al. | 2303.14188v1 | null |
2023-03-24 | Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior | Junshu Tang et.al. | 2303.14184v1 | link |
2023-03-24 | Scaling Expert Language Models with Unsupervised Domain Discovery | Suchin Gururangan et.al. | 2303.14177v1 | link |
2023-03-24 | A Hybrid ANN-SNN Architecture for Low-Power and Low-Latency Visual Perception | Asude Aydin et.al. | 2303.14176v1 | null |
2023-03-24 | UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields | Yuanbo Yang et.al. | 2303.14167v1 | null |
2023-03-23 | Ablating Concepts in Text-to-Image Diffusion Models | Nupur Kumari et.al. | 2303.13516v1 | link |
2023-03-23 | Persistent Nature: A Generative Model of Unbounded 3D Worlds | Lucy Chai et.al. | 2303.13515v1 | link |
2023-03-23 | DreamBooth3D: Subject-Driven Text-to-3D Generation | Amit Raj et.al. | 2303.13508v1 | null |
2023-03-23 | A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition | Andong Deng et.al. | 2303.13505v1 | link |
2023-03-23 | Chordal Averaging on Flag Manifolds and Its Applications | Nathan Mankovich et.al. | 2303.13501v1 | link |
2023-03-23 | A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias | Puja Trivedi et.al. | 2303.13500v1 | null |
2023-03-23 | TriPlaneNet: An Encoder for EG3D Inversion | Ananta R. Bhattarai et.al. | 2303.13497v1 | null |
2023-03-22 | Diffuse-Denoise-Count: Accurate Crowd-Counting with Diffusion Models | Yasiru Ranasinghe et.al. | 2303.12790v1 | link |
2023-03-22 | EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Hansheng Chen et.al. | 2303.12787v1 | link |
2023-03-22 | Localization-based OFDM framework for RIS-aided systems | Fabio Saggese et.al. | 2303.12763v1 | link |
2023-03-22 | MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset | Chen Feng et.al. | 2303.12756v1 | link |
2023-03-22 | Invariants for time-dependent Hamiltonian systems | Jürgen Struckmeier et.al. | 2303.12746v1 | null |
2023-03-22 | Comment on the elastica section in Thorne and Blandford “Modern Classical Physics”, the shape of things, and the aspect ratio of reality | J. A. Hanna et.al. | 2303.12729v1 | null |
2023-03-21 | Natural Language-Assisted Sign Language Recognition | Ronglai Zuo et.al. | 2303.12080v1 | link |
2023-03-21 | Two-shot Video Object Segmentation | Kun Yan et.al. | 2303.12078v1 | link |
2023-03-21 | CC3D: Layout-Conditioned Generation of Compositional 3D Scenes | Sherwin Bahmani et.al. | 2303.12074v1 | null |
2023-03-21 | ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals | Xishun Wang et.al. | 2303.12071v1 | null |
2023-03-21 | Machine Learning for Brain Disorders: Transformers and Visual Transformers | Robin Courant et.al. | 2303.12068v1 | null |
2023-03-20 | EVA-02: A Visual Representation for Neon Genesis | Yuxin Fang et.al. | 2303.11331v1 | link |
2023-03-20 | Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation | Ziyang Chen et.al. | 2303.11329v1 | link |
2023-03-20 | Zero-1-to-3: Zero-shot One Image to 3D Object | Ruoshi Liu et.al. | 2303.11328v1 | link |
2023-03-20 | Open-vocabulary Panoptic Segmentation with Embedding Modulation | Xi Chen et.al. | 2303.11324v1 | null |
2023-03-20 | ScribbleSeg: Scribble-based Interactive Image Segmentation | Xi Chen et.al. | 2303.11320v1 | null |
2023-03-20 | Generative Semantic Segmentation | Jiaqi Chen et.al. | 2303.11316v1 | link |
2023-03-20 | waywiser: Ergonomic Methods for Assessing Spatial Models | Michael J Mahoney et.al. | 2303.11312v1 | link |
2023-03-17 | Data-centric Artificial Intelligence: A Survey | Daochen Zha et.al. | 2303.10158v1 | link |
2023-03-17 | CoVIO: Online Continual Learning for Visual-Inertial Odometry | Niclas Vödisch et.al. | 2303.10149v1 | link |
2023-03-17 | CoDEPS: Online Continual Learning for Depth Estimation and Panoptic Segmentation | Niclas Vödisch et.al. | 2303.10147v1 | link |
2023-03-17 | Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting | Nicolai Dorka et.al. | 2303.10144v1 | link |
2023-03-16 | Efficient Diffusion Training via Min-SNR Weighting Strategy | Tiankai Hang et.al. | 2303.09556v1 | link |
2023-03-16 | PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision | Konstantinos Tertikas et.al. | 2303.09554v1 | null |
2023-03-16 | SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving | Yi Wei et.al. | 2303.09551v1 | link |
2023-03-16 | Diffusion-HPC: Generating Synthetic Images with Realistic Humans | Zhenzhen Weng et.al. | 2303.09541v1 | link |
2023-03-16 | Deep Metric Learning for Unsupervised Remote Sensing Change Detection | Wele Gedara Chaminda Bandara et.al. | 2303.09536v1 | link |
2023-03-17 | FateZero: Fusing Attentions for Zero-shot Text-based Video Editing | Chenyang Qi et.al. | 2303.09535v2 | link |
2023-03-16 | Tackling Clutter in Radar Data – Label Generation and Detection Using PointNet++ | Johannes Kopp et.al. | 2303.09530v1 | link |
2023-03-15 | Borda Regret Minimization for Generalized Linear Dueling Bandits | Yue Wu et.al. | 2303.08816v1 | null |
2023-03-15 | BiFormer: Vision Transformer with Bi-Level Routing Attention | Lei Zhu et.al. | 2303.08810v1 | link |
2023-03-15 | Stochastic Interpolants: A Unifying Framework for Flows and Diffusions | Michael S. Albergo et.al. | 2303.08797v1 | null |
2023-03-15 | PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining | Garrett Thomas et.al. | 2303.08789v1 | null |
2023-03-14 | Diversity-Aware Meta Visual Prompting | Qidong Huang et.al. | 2303.08138v1 | link |
2023-03-14 | LayoutDM: Discrete Diffusion Model for Controllable Layout Generation | Naoto Inoue et.al. | 2303.08137v1 | link |
2023-03-15 | Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations | Jianren Wang et.al. | 2303.08135v2 | null |
2023-03-14 | MeshDiffusion: Score-based Generative 3D Mesh Modeling | Zhen Liu et.al. | 2303.08133v1 | link |
2023-03-15 | A Simple Framework for Open-Vocabulary Segmentation and Detection | Hao Zhang et.al. | 2303.08131v2 | link |
2023-03-14 | ViperGPT: Visual Inference via Python Execution for Reasoning | Dídac Surís et.al. | 2303.08128v1 | link |
2023-03-14 | Blind Video Deflickering by Neural Filtering with a Flawed Atlas | Chenyang Lei et.al. | 2303.08120v1 | link |
2023-03-14 | Parameterised Approximation of the Fixation Probability of the Dominant Mutation in the Multi-Type Moran Process | Leslie Ann Goldberg et.al. | 2303.08118v1 | null |
2023-03-13 | Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need | Da-Wei Zhou et.al. | 2303.07338v1 | link |
2023-03-13 | Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR | Feng Li et.al. | 2303.07335v1 | link |
2023-03-13 | A Smoothing Algorithm for Minimum Sensing Path Plans in Gaussian Belief Space | Ali Reza Pedram et.al. | 2303.07326v1 | null |
2023-03-13 | Collision Cross-entropy and EM Algorithm for Self-labeled Classification | Zhongwen Zhang et.al. | 2303.07321v1 | null |
2023-03-13 | Linear regularized 13-moment equations with Onsager boundary conditions for general gas molecules | Zhenning Cai et.al. | 2303.07314v1 | null |
2023-03-13 | An efficient phase-field model of shear fractures using deviatoric stress split | Ehsan Haghighat et.al. | 2303.07309v1 | link |
2023-03-10 | Multiple Hands Make Light Work: Enhancing Quality and Diversity using MAP-Elites with Multiple Parallel Evolution Strategies | Manon Flageat et.al. | 2303.06137v1 | null |
2023-03-10 | Rewarding Chatbots for Real-World Engagement with Millions of Users | Robert Irvine et.al. | 2303.06135v1 | null |
2023-03-10 | Imaging the crustal and upper mantle structure of the North Anatolian Fault: A Transmission Matrix Framework for Local Adaptive Focusing | Rita Touma et.al. | 2303.06123v1 | null |
2023-03-10 | Ignorance is Bliss: Robust Control via Information Gating | Manan Tomar et.al. | 2303.06121v1 | null |
2023-03-11 | Wave-function parametrization of a probability measure | Leonardo Pedro et.al. | 2303.06069v1 | null |
2023-03-09 | Scaling up GANs for Text-to-Image Synthesis | Minguk Kang et.al. | 2303.05511v1 | null |
2023-03-09 | Planning with Large Language Models for Code Generation | Shun Zhang et.al. | 2303.05510v1 | null |
2023-03-09 | Cherry-Picking with Reinforcement Learning | Yunchu Zhang et.al. | 2303.05508v1 | null |
2023-03-09 | TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization | Alan Jeffares et.al. | 2303.05506v1 | link |
2023-03-09 | Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision | Tarun Kalluri et.al. | 2303.05503v1 | null |
2023-03-09 | PDSketch: Integrated Planning Domain Programming and Learning | Jiayuan Mao et.al. | 2303.05501v1 | null |
2023-03-10 | Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu et.al. | 2303.05499v2 | link |
2023-03-09 | Learning Stationary Markov Processes with Contrastive Adjustment | Ludvig Bergenstråhle et.al. | 2303.05497v1 | link |
2023-03-09 | Sparse and Local Networks for Hypergraph Reasoning | Guangxuan Xiao et.al. | 2303.05496v1 | null |
2023-03-08 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | Jiarui Xu et.al. | 2303.04803v1 | link |
2023-03-08 | Stabilized profunctors and stable species of structures | Marcelo Fiore et.al. | 2303.04795v1 | null |
2023-03-08 | Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation | Paul Hagemann et.al. | 2303.04772v1 | link |
2023-03-08 | SMaLL: A Software Framework for portable Machine Learning Libraries | Upasana Sridhar et.al. | 2303.04769v1 | null |
2023-03-07 | Benign Overfitting for Two-layer ReLU Networks | Yiwen Kou et.al. | 2303.04145v1 | link |
2023-03-07 | Toward Defining a Domain Complexity Measure Across Domains | Katarina Doctor et.al. | 2303.04141v1 | null |
2023-03-07 | Diffusion Policy: Visuomotor Policy Learning via Action Diffusion | Cheng Chi et.al. | 2303.04137v1 | null |
2023-03-07 | Inadequacy of equivalent circuits in nonlinear systems with inherent memory | V. Lopez-Richard et.al. | 2303.04135v1 | null |
2023-03-07 | Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction | Martin Josifoski et.al. | 2303.04132v1 | link |
2023-03-07 | Foundation Models for Decision Making: Problems, Methods, and Opportunities | Sherry Yang et.al. | 2303.04129v1 | null |
2023-03-07 | Private Read-Update-Write with Controllable Information Leakage for Storage-Efficient Federated Learning with Top $r$ Sparsification | Sajani Vithana et.al. | 2303.04123v1 | null |
2023-03-06 | Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers | Sitan Chen et.al. | 2303.03384v1 | null |
2023-03-06 | SUREL+: Moving from Walks to Sets for Scalable Subgraph-based Graph Representation Learning | Haoteng Yin et.al. | 2303.03379v1 | link |
2023-03-06 | PaLM-E: An Embodied Multimodal Language Model | Danny Driess et.al. | 2303.03378v1 | null |
2023-03-06 | MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning | Mikayel Samvelyan et.al. | 2303.03376v1 | null |
2023-03-06 | Detecting Human-Object Contact in Images | Yixin Chen et.al. | 2303.03373v1 | link |
2023-03-06 | ALMOST: Adversarial Learning to Mitigate Oracle-less ML Attacks via Synthesis Tuning | Animesh Basak Chowdhury et.al. | 2303.03372v1 | null |
2023-03-06 | Complex Systems of Secrecy: The Offshore Networks of Oligarchs | Ho-Chun Herbert Chang et.al. | 2303.03371v1 | null |
2023-03-06 | Multimodal Prompting with Missing Modalities for Visual Recognition | Yi-Lun Lee et.al. | 2303.03369v1 | link |
2023-03-06 | Referring Multi-Object Tracking | Dongming Wu et.al. | 2303.03366v1 | link |
2023-03-06 | Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed Environments | Jun Yamada et.al. | 2303.03365v1 | null |
2023-03-03 | Unleashing Text-to-Image Diffusion Models for Visual Perception | Wenliang Zhao et.al. | 2303.02153v1 | link |
2023-03-03 | Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners | Renrui Zhang et.al. | 2303.02151v1 | link |
2023-03-03 | Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together! | Shiwei Liu et.al. | 2303.02141v1 | link |
2023-03-03 | Eventual Discounting Temporal Logic Counterfactual Experience Replay | Cameron Voloshin et.al. | 2303.02135v1 | null |
2023-03-02 | Dropout Reduces Underfitting | Zhuang Liu et.al. | 2303.01500v1 | link |
2023-03-02 | Predicting Motion Plans for Articulating Everyday Objects | Arjun Gupta et.al. | 2303.01484v1 | null |
2023-03-02 | Faster exact and approximation algorithms for packing and covering matroids via push-relabel | Kent Quanrud et.al. | 2303.01478v1 | null |
2023-03-01 | StraIT: Non-autoregressive Generation with Stratified Image Transformer | Shengju Qian et.al. | 2303.00750v1 | null |
2023-03-01 | Coordination of Multiple Robots along Given Paths with Bounded Junction Complexity | Mikkel Abrahamsen et.al. | 2303.00745v1 | null |
2023-03-01 | READ Avatars: Realistic Emotion-controllable Audio Driven Avatars | Jack Saunders et.al. | 2303.00744v1 | null |
2023-03-01 | R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents | Daniel D. Johnson et.al. | 2303.00732v1 | link |
2023-03-01 | A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation | J. Pourmostafa Roshan Sharami et.al. | 2303.00722v1 | null |
2023-02-28 | An Efficient Tester-Learner for Halfspaces | Aravind Gollakota et.al. | 2302.14853v1 | null |
2023-02-27 | Internet Explorer: Targeted Representation Learning on the Open Web | Alexander C. Li et.al. | 2302.14051v1 | link |
2023-02-27 | Language Is Not All You Need: Aligning Perception with Language Models | Shaohan Huang et.al. | 2302.14045v1 | link |
2023-02-27 | Permutation Equivariant Neural Functionals | Allan Zhou et.al. | 2302.14040v1 | link |
2023-02-27 | Measurement of Orbital Angular Momentum of Light using Stokes Parameters and Barnett’s Formalism | Anirban Debnath et.al. | 2302.14025v1 | null |
2023-02-27 | Diacritic Recognition Performance in Arabic ASR | Hanan Aldarmaki et.al. | 2302.14022v1 | null |
2023-02-27 | Full Stack Optimization of Transformer Inference: a Survey | Sehoon Kim et.al. | 2302.14017v1 | null |
2023-02-24 | SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries | Ahmed Imtiaz Humayun et.al. | 2302.12828v1 | link |
2023-02-24 | Generative Models of Huge Objects | Lunjia Hu et.al. | 2302.12823v1 | null |
2023-02-24 | Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data | KaShun Shum et.al. | 2302.12822v1 | link |
2023-02-24 | GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification | Mengting Zhou et.al. | 2302.12814v1 | null |
2023-02-24 | Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback | Baolin Peng et.al. | 2302.12813v1 | null |
2023-02-23 | Change is Hard: A Closer Look at Subpopulation Shift | Yuzhe Yang et.al. | 2302.12254v1 | link |
2023-02-23 | Boosting Adversarial Transferability using Dynamic Cues | Muzammal Naseer et.al. | 2302.12252v1 | null |
2023-02-23 | VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion | Yiming Li et.al. | 2302.12251v1 | link |
2023-02-23 | Sequence-Based Incremental Concolic Testing of RTL Models | Hasini Witharana et.al. | 2302.12241v1 | null |
2023-02-23 | What makes a language easy to deep-learn? | Lukas Galke et.al. | 2302.12239v1 | link |
2023-02-23 | Improving Adaptive Conformal Prediction Using Self-Supervised Learning | Nabeel Seedat et.al. | 2302.12238v1 | link |
2023-02-23 | Learning Neural Volumetric Representations of Dynamic Humans in Minutes | Chen Geng et.al. | 2302.12237v1 | link |
2023-02-23 | DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models | Jamie Wynn et.al. | 2302.12231v1 | link |
2023-02-22 | Beyond optimal disturbances: a statistical framework for transient growth | Peter Frame et.al. | 2302.11564v1 | null |
2023-02-22 | Uncovering Bias in Face Generation Models | Cristian Muñoz et.al. | 2302.11562v1 | null |
2023-02-22 | Equivariant Polynomials for Graph Neural Networks | Omri Puny et.al. | 2302.11556v1 | null |
2023-02-22 | RoboNinja: Learning an Adaptive Cutting Policy for Multi-Material Objects | Zhenjia Xu et.al. | 2302.11553v1 | null |
2023-02-22 | Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC | Yilun Du et.al. | 2302.11552v1 | link |
2023-02-22 | Scaling Robot Learning with Semantically Imagined Experience | Tianhe Yu et.al. | 2302.11550v1 | null |
2023-02-21 | Some Fundamental Aspects about Lipschitz Continuity of Neural Network Functions | Grigory Khromov et.al. | 2302.10886v1 | null |
2023-02-21 | Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction | Pei Xu et.al. | 2302.10873v1 | link |
2023-02-21 | Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation | Biao Zhang et.al. | 2302.10871v1 | link |
2023-02-21 | Provable Copyright Protection for Generative Models | Nikhil Vyas et.al. | 2302.10870v1 | null |
2023-02-21 | A Unifying Perspective on Multi-Calibration: Unleashing Game Dynamics for Multi-Objective Learning | Nika Haghtalab et.al. | 2302.10863v1 | null |
2023-02-20 | Towards Universal Fake Image Detectors that Generalize Across Generative Models | Utkarsh Ojha et.al. | 2302.10174v1 | link |
2023-02-20 | Identity-Based Attribute Prototypes Distinguish Communities on Twitter | Thomas Magelinski et.al. | 2302.10172v1 | null |
2023-02-20 | Compressed Error HARQ: Feedback Communication on Noise-Asymmetric Channels | Sravan Kumar Ankireddy et.al. | 2302.10170v1 | link |
2023-02-20 | Learning Deep Semantics for Test Completion | Pengyu Nie et.al. | 2302.10166v1 | link |
2023-02-20 | Sparse PCA Beyond Covariance Thresholding | Gleb Novikov et.al. | 2302.10158v1 | null |
2023-02-17 | Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent | Giannis Daras et.al. | 2302.09057v1 | link |
2023-02-17 | Geometric description of clustering in directed networks | Antoine Allard et.al. | 2302.09055v1 | link |
2023-02-17 | MiDi: Mixed Graph and 3D Denoising Diffusion for Molecule Generation | Clement Vignac et.al. | 2302.09048v1 | link |
2023-02-17 | From User Perceptions to Technical Improvement: Enabling People Who Stutter to Better Use Speech Recognition | Colin Lea et.al. | 2302.09044v1 | null |
2023-02-17 | Privately Customizing Prefinetuning to Better Match User Data in Federated Learning | Charlie Hou et.al. | 2302.09042v1 | null |
2023-02-16 | Text-driven Visual Synthesis with Latent Diffusion Prior | Ting-Hsuan Liao et.al. | 2302.08510v1 | null |
2023-02-16 | 3D-aware Conditional Image Synthesis | Kangle Deng et.al. | 2302.08509v1 | link |
2023-02-16 | The Scope of Multicalibration: Characterizing Multicalibration via Property Elicitation | Georgy Noarov et.al. | 2302.08507v1 | null |
2023-02-15 | Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks | Atabey Ünlü et.al. | 2302.07868v1 | link |
2023-02-15 | Learning Performance-Improving Code Edits | Aman Madaan et.al. | 2302.07867v1 | link |
2023-02-15 | Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation | Joshua Vendrow et.al. | 2302.07865v1 | link |
2023-02-15 | Big Little Transformer Decoder | Sehoon Kim et.al. | 2302.07863v1 | link |
2023-02-15 | One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2 | Trevine Oorloff et.al. | 2302.07848v1 | null |
2023-02-15 | NL2CMD: An Updated Workflow for Natural Language to Bash Commands Translation | Quchen Fu et.al. | 2302.07845v1 | link |
2023-02-14 | Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions | Raghav Singhal et.al. | 2302.07261v1 | null |
2023-02-14 | ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models | Sheng Wang et.al. | 2302.07257v1 | link |
2023-02-14 | Energy Transformer | Benjamin Hoover et.al. | 2302.07253v1 | link |
2023-02-14 | Generation Probabilities Are Not Enough: Exploring the Effectiveness of Uncertainty Highlighting in AI-Powered Code Completions | Helena Vasconcelos et.al. | 2302.07248v1 | null |
2023-02-14 | A Deep Probabilistic Spatiotemporal Framework for Dynamic Graph Representation Learning with Application to Brain Disorder Identification | Junn Yong Loo et.al. | 2302.07243v1 | null |
2023-02-14 | Parker Solar Probe Observations of High Plasma Beta Solar Wind from Streamer Belt | Jia Huang et.al. | 2302.07230v1 | null |
2023-02-13 | 3D-aware Blending with Generative NeRFs | Hyunsu Kim et.al. | 2302.06608v1 | link |
2023-02-13 | Generative Adversarial Equilibrium Solvers | Denizalp Goktas et.al. | 2302.06607v1 | null |
2023-02-13 | Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation | Yuanhao Wang et.al. | 2302.06606v1 | null |
2023-02-13 | FilFL: Accelerating Federated Learning via Client Filtering | Fares Fourati et.al. | 2302.06599v1 | null |
2023-02-13 | The Impact of AI on Developer Productivity: Evidence from GitHub Copilot | Sida Peng et.al. | 2302.06590v1 | null |
2023-02-13 | Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction | Xinyu Zhang et.al. | 2302.06589v1 | null |
2023-02-13 | Raising the Cost of Malicious AI-Powered Image Editing | Hadi Salman et.al. | 2302.06588v1 | link |
2023-02-13 | AbLit: A Resource for Analyzing and Generating Abridged Versions of English Literature | Melissa Roemmele et.al. | 2302.06579v1 | link |
2023-02-10 | Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features | Annie S. Chen et.al. | 2302.05441v1 | null |
2023-02-09 | RelightableHands: Efficient Neural Relighting of Articulated Hand Models | Shun Iwase et.al. | 2302.04866v1 | null |
2023-02-09 | Polynomial Neural Fields for Subband Decomposition and Manipulation | Guandao Yang et.al. | 2302.04862v1 | link |
2023-02-09 | Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning | Zhuolin Yang et.al. | 2302.04858v1 | null |
2023-02-09 | One-shot Visual Imitation via Attributed Waypoints and Demonstration Augmentation | Matthew Chang et.al. | 2302.04856v1 | null |
2023-02-09 | SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks | Mahdi Nikdan et.al. | 2302.04852v1 | link |
2023-02-09 | Robot Synesthesia: A Sound and Emotion Guided AI Painter | Vihaan Misra et.al. | 2302.04850v1 | link |
2023-02-09 | Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms | Stevo Racković et.al. | 2302.04843v1 | null |
2023-02-09 | Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation | Anton Voronov et.al. | 2302.04841v1 | link |
2023-02-08 | PFGM++: Unlocking the Potential of Physics-Inspired Generative Models | Yilun Xu et.al. | 2302.04265v1 | link |
2023-02-08 | Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration | Chentian Jiang et.al. | 2302.04250v1 | null |
2023-02-08 | Federated Minimax Optimization with Client Heterogeneity | Pranay Sharma et.al. | 2302.04249v1 | null |
2023-02-08 | Shortcut Detection with Variational Autoencoders | Nicolas M. Müller et.al. | 2302.04246v1 | link |
2023-02-07 | Long Horizon Temperature Scaling | Andy Shih et.al. | 2302.03686v1 | link |
2023-02-07 | Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications | Johannes Kirschner et.al. | 2302.03683v1 | null |
2023-02-07 | Auditing Gender Presentation Differences in Text-to-Image Models | Yanzhe Zhang et.al. | 2302.03675v1 | link |
2023-02-07 | Proportionality in Approval-Based Participatory Budgeting | Markus Brill et.al. | 2302.03672v1 | null |
2023-02-07 | Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery | Yuxin Wen et.al. | 2302.03668v1 | link |
2023-02-07 | HumanMAC: Masked Motion Completion for Human Motion Prediction | Ling-Hao Chen et.al. | 2302.03665v1 | link |
2023-02-07 | SDYN-GANs: Adversarial Learning Methods for Multistep Generative Models for General Order Stochastic Dynamics | Panos Stinis et.al. | 2302.03663v1 | null |
2023-02-06 | Zero-shot Image-to-Image Translation | Gaurav Parmar et.al. | 2302.03027v1 | link |
2023-02-06 | AIM: Adapting Image Models for Efficient Video Action Recognition | Taojiannan Yang et.al. | 2302.03024v1 | null |
2023-02-06 | Geometry of contact: contact planning for multi-legged robots via spin models duality | Baxi Chong et.al. | 2302.03019v1 | null |
2023-02-06 | Structure and Content-Guided Video Synthesis with Diffusion Models | Patrick Esser et.al. | 2302.03011v1 | null |
2023-02-06 | A novel Doppler backscattering (DBS) system to simultaneously monitor radio frequency plasma fluctuations and low frequency turbulence | S. Chowdhury et.al. | 2302.03009v1 | null |
2023-02-03 | Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study | Muhammad Waseem et.al. | 2302.01894v1 | null |
2023-02-03 | Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and Early Exits | Simone Sarti et.al. | 2302.01888v1 | null |
2023-02-03 | Analyzing the impact of climate change on critical infrastructure from the scientific literature: A weakly supervised NLP approach | Tanwi Mallick et.al. | 2302.01887v1 | null |
2023-02-03 | LIDAR-based Stabilization, Navigation and Localization for UAVs Operating in Dark Indoor Environments | Matěj Petrlík et.al. | 2302.01883v1 | null |
2023-02-03 | IKEA-Manual: Seeing Shape Assembly Step by Step | Ruocheng Wang et.al. | 2302.01881v1 | null |
2023-02-02 | STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation | Yupeng Zheng et.al. | 2302.01334v1 | link |
2023-02-02 | Bayesian Metric Learning for Uncertainty Quantification in Image Retrieval | Frederik Warburg et.al. | 2302.01332v1 | link |
2023-02-02 | SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections | Zhaoxi Chen et.al. | 2302.01330v1 | link |
2023-02-02 | Dreamix: Video Diffusion Models are General Video Editors | Eyal Molad et.al. | 2302.01329v1 | null |
2023-02-02 | $IC^3$ : Image Captioning by Committee Consensus | David M. Chan et.al. | 2302.01328v1 | link |
2023-02-02 | Randomized Greedy Learning for Non-monotone Stochastic Submodular Maximization Under Full-bandit Feedback | Fares Fourati et.al. | 2302.01324v1 | null |
2023-02-02 | Signatures for strong-field QED physics in the quantum limit of beamstrahlung | W. L. Zhang et.al. | 2302.01321v1 | null |
2023-02-01 | Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data | Alon Albalak et.al. | 2302.00674v1 | link |
2023-02-01 | ‘Generative CI’ through Collective Response Systems | Aviv Ovadya et.al. | 2302.00672v1 | null |
2023-02-01 | Efficient Multi-Task Reinforcement Learning via Selective Behavior Sharing | Grace Zhang et.al. | 2302.00671v1 | null |
2023-02-01 | Stable Target Field for Reduced Variance Score Estimation in Diffusion Models | Yilun Xu et.al. | 2302.00670v1 | link |
2023-02-01 | Does Vision Accelerate Hierarchical Generalization of Neural Language Learners? | Tatsuki Kuribayashi et.al. | 2302.00667v1 | null |
2023-02-01 | Extrinsic Calibration of 2D mm-Wavelength Radar Pairs Using Ego-Velocity Estimates | Qilong Cheng et.al. | 2302.00660v1 | null |
2023-02-01 | Graph Neural Operators for Classification of Spatial Transcriptomics Data | Junaid Ahmed et.al. | 2302.00658v1 | null |
2023-01-31 | Reverse engineering adversarial attacks with fingerprints from adversarial examples | David Aaron Nicholson et.al. | 2301.13869v1 | null |
2023-01-31 | PADL: Language-Directed Physics-Based Character Control | Jordan Juravsky et.al. | 2301.13868v1 | link |
2023-01-31 | Zero-Memory Graph Exploration with Unknown Inports | Hans-Joachim Böckenhauer et.al. | 2301.13860v1 | null |
2023-01-31 | Interpreting Robustness Proofs of Deep Neural Networks | Debangshu Banerjee et.al. | 2301.13845v1 | null |
2023-01-31 | Do Multi-Document Summarization Models Synthesize? | Jay DeYoung et.al. | 2301.13844v1 | null |
2023-01-31 | RIS-Assisted Interference Mitigation for Uplink NOMA | Azadeh Tabeshnezhad et.al. | 2301.13841v1 | null |
2023-01-30 | Looped Transformers as Programmable Computers | Angeliki Giannou et.al. | 2301.13196v1 | null |
2023-01-30 | Adaptive Computation with Elastic Input Sequence | Fuzhao Xue et.al. | 2301.13195v1 | link |
2023-01-30 | Audio-Visual Segmentation with Semantics | Jinxing Zhou et.al. | 2301.13190v1 | link |
2023-01-30 | Extracting Training Data from Diffusion Models | Nicholas Carlini et.al. | 2301.13188v1 | null |
2023-01-30 | Weighted flow diffusion for local graph clustering with node attributes: an algorithm and statistical guarantees | Shenghao Yang et.al. | 2301.13187v1 | link |
2023-01-30 | Optimal Decision Tree Policies for Markov Decision Processes | Daniël Vos et.al. | 2301.13185v1 | link |
2023-01-27 | Incorporating Background Knowledge in Symbolic Regression using a Computer Algebra System | Charles Fox et.al. | 2301.11919v1 | null |
2023-01-27 | OccRob: Efficient SMT-Based Occlusion Robustness Verification of Deep Neural Networks | Xingwu Guo et.al. | 2301.11912v1 | null |
2023-01-27 | Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees | Johanna Vielhaben et.al. | 2301.11911v1 | link |
2023-01-27 | Tree-structured Policy Planning with Learned Behavior Models | Yuxiao Chen et.al. | 2301.11902v1 | null |
2023-01-26 | Conservative Safety Monitors of Stochastic Dynamical Systems | Matthew Cleaveland et.al. | 2301.11330v1 | null |
2023-01-26 | MusicLM: Generating Music From Text | Andrea Agostinelli et.al. | 2301.11325v1 | null |
2023-01-26 | Joint Training of Deep Ensembles Fails Due to Learner Collusion | Alan Jeffares et.al. | 2301.11323v1 | null |
2023-01-26 | Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Xudong Wang et.al. | 2301.11320v1 | link |
2023-01-26 | Learning Good Features to Transfer Across Tasks and Domains | Pierluigi Zama Ramirez et.al. | 2301.11310v1 | null |
2023-01-26 | SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification | Pranjal Aggarwal et.al. | 2301.11309v1 | link |
2023-01-26 | Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series | Abdul Fatir Ansari et.al. | 2301.11308v1 | link |
2023-01-26 | DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature | Eric Mitchell et.al. | 2301.11305v1 | link |
2023-01-25 | Fillers in Spoken Language Understanding: Computational and Psycholinguistic Perspectives | Tanvi Dinkar et.al. | 2301.10761v1 | null |
2023-01-25 | Efficient Flow-Guided Multi-frame De-fencing | Stavros Tsogkas et.al. | 2301.10759v1 | null |
2023-01-25 | Room-Temperature Sputtered Ultralow-loss Silicon Nitride for Hybrid Photonic Integration | Shuangyou Zhang et.al. | 2301.10758v1 | null |
2023-01-25 | Generating large-scale network analyses of scientific landscapes in seconds using Dimensions on Google BigQuery | Michele Pasin et.al. | 2301.10736v1 | null |
2023-01-25 | The Synchronic Web | Thien-Nam Dinh et.al. | 2301.10733v1 | null |
2023-01-24 | A Watermark for Large Language Models | John Kirchenbauer et.al. | 2301.10226v1 | link |
2023-01-24 | Evolution of cooperation under a generalized death-birth process | Chaoqian Wang et.al. | 2301.10205v1 | null |
2023-01-24 | A general epidemic model and its application to mask design considering different preferences towards masks | Chaoqian Wang et.al. | 2301.10202v1 | null |
2023-01-23 | InfiniCity: Infinite-Scale City Synthesis | Chieh Hubert Lin et.al. | 2301.09637v1 | null |
2023-01-23 | Feature construction using explanations of individual predictions | Boštjan Vouk et.al. | 2301.09631v1 | null |
2023-01-23 | Tracking the industrial growth of modern China with high-resolution panchromatic imagery: A sequential convolutional approach | Ethan Brewer et.al. | 2301.09620v1 | null |
2023-01-23 | Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics | Aamal Abbas Hussain et.al. | 2301.09619v1 | null |
2023-01-20 | The stochastic digital human is now enrolling for in silico imaging trials – Methods and tools for generating digital cohorts | A Badano et.al. | 2301.08719v1 | null |
2023-01-20 | Massively Parallel Genetic Optimization through Asynchronous Propagation of Populations | Oskar Taubert et.al. | 2301.08713v1 | link |
2023-01-19 | Multiview Compressive Coding for 3D Reconstruction | Chao-Yuan Wu et.al. | 2301.08247v1 | link |
2023-01-19 | Booster: a Benchmark for Depth from Images of Specular and Transparent Surfaces | Pierluigi Zama Ramirez et.al. | 2301.08245v1 | null |
2023-01-19 | Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture | Mahmoud Assran et.al. | 2301.08243v1 | link |
2023-01-19 | Radiation-induced secondary emissions in solid-state devices as a possible contribution to quasiparticle poisoning of superconducting circuits | Francisco Ponce et.al. | 2301.08239v1 | null |
2023-01-18 | Robust Zero-crossings Detection in Noisy Signals using Topological Signal Processing | Sunia Tanweer et.al. | 2301.07703v1 | null |
2023-01-18 | Learning 3D-aware Image Synthesis with Unknown Pose Distribution | Zifan Shi et.al. | 2301.07702v1 | null |
2023-01-18 | Prony-Based Super-Resolution Phase Retrieval of Sparse, Multivariate Signals | Robert Beinert et.al. | 2301.07696v1 | null |
2023-01-18 | Private Federated Submodel Learning via Private Set Union | Zhusheng Wang et.al. | 2301.07686v1 | null |
2023-01-18 | SFQEDtoolkit: a high-performance library for the accurate modeling of strong-field QED processes in PIC and Monte Carlo codes | Samuele Montefiori et.al. | 2301.07684v1 | link |
2023-01-18 | OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation | Tong Wu et.al. | 2301.07525v1 | null |
2023-01-17 | Three Dimensional Odd Viscosity in Ferrofluids with Vorticity-Magnetization Coupling | Dylan Reynolds et.al. | 2301.07096v1 | null |
2023-01-17 | On the State of German (Abstractive) Text Summarization | Dennis Aumiller et.al. | 2301.07095v1 | link |
2023-01-17 | Learning Customized Visual Models with Retrieval-Augmented Knowledge | Haotian Liu et.al. | 2301.07094v1 | link |
2023-01-17 | GLIGEN: Open-Set Grounded Text-to-Image Generation | Yuheng Li et.al. | 2301.07093v1 | link |
2023-01-17 | Vision Learners Meet Web Image-Text Pairs | Bingchen Zhao et.al. | 2301.07088v1 | null |
2023-01-17 | MooseNet: A trainable metric for synthesized speech with plda backend | Ondřej Plátek et.al. | 2301.07087v1 | link |
2023-01-17 | Transformers as Algorithms: Generalization and Implicit Model Selection in In-context Learning | Yingcong Li et.al. | 2301.07067v1 | link |
2023-01-13 | Non-Stochastic CDF Estimation Using Threshold Queries | Princewill Okoroafor et.al. | 2301.05682v1 | null |
2023-01-12 | See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning | Zhenfang Chen et.al. | 2301.05226v1 | null |
2023-01-12 | Domain Expansion of Image Generators | Yotam Nitzan et.al. | 2301.05225v1 | null |
2023-01-12 | Guiding Text-to-Image Diffusion Model Towards Grounded Generation | Ziyi Li et.al. | 2301.05221v1 | null |
2023-01-12 | Adversarial Adaptation for French Named Entity Recognition | Arjun Choudhry et.al. | 2301.05220v1 | link |
2023-01-12 | NDNSD: Service Publishing and Discovery in NDN | Saurab Dulal et.al. | 2301.05218v1 | null |
(<a href=#Updated-on-20240404>back to top</a>)