Home
Categories
EXPLORE
True Crime
Comedy
Society & Culture
Business
Sports
Technology
Health & Fitness
About Us
Contact Us
Copyright
© 2024 PodJoint
Podjoint Logo
US
00:00 / 00:00
Sign in

or

Don't have an account?
Sign up
Forgot password
https://is1-ssl.mzstatic.com/image/thumb/Podcasts126/v4/4a/9c/ef/4a9ceff8-5c1a-e15c-62d9-6360c52cd38a/mza_2283181023971434852.jpg/600x600bb.jpg
TechcraftingAI Computer Vision
Brad Edwards
315 episodes
5 days ago
TechcraftingAI Computer Vision brings you summaries of the latest arXiv research daily. Research is read by your virtual host, Sage. The podcast is produced by Brad Edwards, an AI Engineer from Vancouver, BC, and a graduate student of computer science studying AI at the University of York. Thank you to arXiv for use of its open access interoperability.
Show more...
Technology
RSS
All content for TechcraftingAI Computer Vision is the property of Brad Edwards and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
TechcraftingAI Computer Vision brings you summaries of the latest arXiv research daily. Research is read by your virtual host, Sage. The podcast is produced by Brad Edwards, an AI Engineer from Vancouver, BC, and a graduate student of computer science studying AI at the University of York. Thank you to arXiv for use of its open access interoperability.
Show more...
Technology
https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/39305030/39305030-1703089970889-aab16cf4a6955.jpg
Ep. 247 - Part 1 - June 13, 2024
TechcraftingAI Computer Vision
48 minutes 20 seconds
1 year ago
Ep. 247 - Part 1 - June 13, 2024

ArXiv Computer Vision research for Thursday, June 13, 2024.


00:21: FouRA: Fourier Low Rank Adaptation

01:41: Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

03:18: Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

04:57: Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting

06:46: ToSA: Token Selective Attention for Efficient Vision Transformers

08:00: Computer vision-based model for detecting turning lane features on Florida's public roadways

09:08: Improving Adversarial Robustness via Feature Pattern Consistency Constraint

10:52: Research on Deep Learning Model of Feature Extraction Based on Convolutional Neural Network

12:10: NeRF Director: Revisiting View Selection in Neural Volume Rendering

13:36: Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency

15:03: Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability,Reproducibility, and Practicality

16:40: COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing

18:16: Fusion of regional and sparse attention in Vision Transformers

19:26: Zoom and Shift are All You Need

20:17: EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

21:49: The Penalized Inverse Probability Measure for Conformal Classification

23:24: OpenMaterial: A Comprehensive Dataset of Complex Materials for 3D Reconstruction

24:47: Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation

26:30: Computer Vision Approaches for Automated Bee Counting Application

27:17: Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding

28:16: A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras

29:43: Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer

31:25: Neural NeRF Compression

32:29: Preserving Identity with Variational Score for General-purpose 3D Editing

33:50: AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings

34:51: Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition

36:10: Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation

37:34: AMSA-UNet: An Asymmetric Multiple Scales U-net Based on Self-attention for Deblurring

38:49: Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark

40:45: A PCA based Keypoint Tracking Approach to Automated Facial Expressions Encoding

42:02: Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?

43:28: FacEnhance: Facial Expression Enhancing with Recurrent DDPMs

45:11: How structured are the representations in transformer-based vision encoders? An analysis of multi-object representations in vision-language models

47:08: Suitability of KANs for Computer Vision: A preliminary investigation

TechcraftingAI Computer Vision
TechcraftingAI Computer Vision brings you summaries of the latest arXiv research daily. Research is read by your virtual host, Sage. The podcast is produced by Brad Edwards, an AI Engineer from Vancouver, BC, and a graduate student of computer science studying AI at the University of York. Thank you to arXiv for use of its open access interoperability.