Ep. 246 - Part 2 - June 12, 2024

TechcraftingAI Computer Vision

Brad Edwards

315 episodes

TechcraftingAI Computer Vision brings you summaries of the latest arXiv research daily. Research is read by your virtual host, Sage. The podcast is produced by Brad Edwards, an AI Engineer from Vancouver, BC, and a graduate student of computer science studying AI at the University of York. Thank you to arXiv for use of its open access interoperability.

Technology

43 minutes 29 seconds

arXiv Computer Vision research for Wednesday, June 12, 2024.

00:21: From Sim-to-Real: Toward General Event-based Low-light Frame Interpolation with Per-scene Optimization

01:44: Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement

03:20: Adversarial Patch for 3D Local Feature Extractor

04:00: Valeo4Cast: A Modular Approach to End-to-End Forecasting

05:38: The impact of deep learning aid on the workload and interpretation accuracy of radiologists on chest computed tomography: a cross-over reader study

08:50: Universal Scale Laws for Colors and Patterns in Imagery

10:11: CT3D++: Improving 3D Object Detection with Keypoint-induced Channel-wise Transformer

11:44: ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

13:25: Continuous fake media detection: adapting deepfake detectors to new generative techniques

15:18: Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment

16:23: One-Step Effective Diffusion Network for Real-World Image Super-Resolution

18:12: 2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

19:22: Diffusion-Promoted HDR Video Reconstruction

21:09: Runtime Freezing: Dynamic Class Loss for Multi-Organ 3D Segmentation

21:52: A Sociotechnical Lens for Evaluating Computer Vision Models: A Case Study on Detecting and Reasoning about Gender and Emotion

23:54: DistilDoc: Knowledge Distillation for Visually-Rich Document Applications

25:28: Using Deep Convolutional Neural Networks to Detect Rendered Glitches in Video Games

26:39: OpenCOLE: Towards Reproducible Automatic Graphic Design Generation

27:23: Dataset Enhancement with Instance-Level Augmentations

28:33: Interpretable Representation Learning of Cardiac MRI via Attribute Regularization

29:33: A New Class Biorthogonal Spline Wavelet for Image Edge Detection

30:48: Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

32:10: Vessel Re-identification and Activity Detection in Thermal Domain for Maritime Surveillance

33:32: AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer

35:09: From Chaos to Clarity: 3DGS in the Dark

36:32: LaMOT: Language-Guided Multi-Object Tracking

38:07: UDON: Universal Dynamic Online distillatioN for generic image representations

39:49: WMAdapter: Adding WaterMark Control to Latent Diffusion Models

40:48: Blind Image Deblurring using FFT-ReLU with Deep Learning Pipeline Integration

42:06: DocSynthv2: A Practical Autoregressive Modeling for Document Generation