Ep. 239 - Part 2 - June 5, 2024

TechcraftingAI Computer Vision

Brad Edwards

TechcraftingAI Computer Vision brings you summaries of the latest arXiv research daily. Research is read by your virtual host, Sage. The podcast is produced by Brad Edwards, an AI Engineer from Vancouver, BC, and a graduate student of computer science studying AI at the University of York. Thank you to arXiv for use of its open access interoperability.

Technology

35 minutes 39 seconds

ArXiv Computer Vision research for Wednesday, June 05, 2024.

00:20: Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision

02:03: A-Bench: Are LMMs Masters at Evaluating AI-generated Images?

03:42: Exploiting LMM-based knowledge for image classification tasks

04:37: EgoSurgery-Tool: A Dataset of Surgical Tool and Hand Detection from Egocentric Open Surgery Videos

06:09: EpidermaQuant: Unsupervised detection and quantification of epidermal differentiation markers on H-DAB-stained images of reconstructed human epidermis

08:15: Enhancing 3D Lane Detection and Topology Reasoning with 2D Lane Priors

09:24: VQUNet: Vector Quantization U-Net for Defending Adversarial Attacks by Regularizing Unwanted Noise

10:36: Enhanced Automotive Object Detection via RGB-D Fusion in a DiffusionDet Framework

11:42: ZeroPur: Succinct Training-Free Adversarial Purification

13:23: Tiny models from tiny data: Textual and null-text inversion for few-shot distillation

15:10: Multi-Task Multi-Scale Contrastive Knowledge Distillation for Efficient Medical Image Segmentation

16:44: Dynamic 3D Gaussian Fields for Urban Areas

18:10: MMCL: Boosting Deformable DETR-Based Detectors with Multi-Class Min-Margin Contrastive Learning for Superior Prohibited Item Detection

20:02: FAPNet: An Effective Frequency Adaptive Point-based Eye Tracker

21:52: Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

23:14: Situation Monitor: Diversity-Driven Zero-Shot Out-of-Distribution Detection using Budding Ensemble Architecture for Object Detection

24:28: Writing Order Recovery in Complex and Long Static Handwriting

25:50: Identification of Stone Deterioration Patterns with Large Multimodal Models

26:58: Searching Priors Makes Text-to-Video Synthesis Better

28:32: Interactive Image Selection and Training for Brain Tumor Segmentation Network

29:35: Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models

30:53: Generative Diffusion Models for Fast Simulations of Particle Collisions at CERN

31:52: Prompt-based Visual Alignment for Zero-shot Policy Transfer

33:33: ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection