© 2024 PodJoint
AI Innovators - By SaladCloud
AI Innovators
17 episodes
6 months ago
Technology
All content for AI Innovators - By SaladCloud is the property of AI Innovators and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (17/17)
From WhatsApp Bot to a Global AI Profile Pic Brand: The Secta Labs Story | Ep 17 | Marko Jak
In this episode of the AI Innovators Podcast, we are joined by Marko Jak, founder of Secta Labs, to discuss the AI-driven transformation of professional profile pictures. Marko shares the journey of Secta Labs, from its early days as a WhatsApp bot creating fun images to its rise as a go-to for high-quality AI-generated profile pictures on platforms like LinkedIn and Twitter. He talks about challenges in the image generation space, including issues of bias and quality, and how Secta Labs continuously iterates to improve. You'll also hear touching customer stories that highlight the emotional impact of being able to present oneself confidently through AI-generated images. Tune in to learn about the technical innovations behind Secta Labs and the future vision of enhancing memories through advanced AI imagery. 00:00 Introduction to the AI Innovators Podcast 00:10 Meet Marko Jak: Founder of Secta Labs 01:10 The birth of Secta Labs 02:26 From WhatsApp bot to a global AI brand 06:37 Growing an AI tool with influencers and community 08:58 Making $100k in 5 hrs from a LinkedIn post 10:55 Why AI profile pictures is one of the hardest technical problems 20:58 Addressing bias & ethics in profile pictures 26:40 Competing against newer AI tools 32:42 Helping people see themselves better with AI 37:30 Conclusion and Final Thoughts
6 months ago
37 minutes 54 seconds

From Dentists to Lawyers: How AI conversational intelligence is helping SMB Marketers - Ep 16
In this episode of the AI Innovators Podcast, we sit down with Ryan Johnson, Chief Product Officer at CallRail, a leading conversational intelligence platform. This podcast is brought to you by SaladCloud (www.salad.com). Ryan shares insights into how CallRail empowers marketers and small to medium-sized businesses by providing deep attribution and conversational analysis. He discusses the origins of CallRail, their unique focus on marketers, and the powerful AI-driven features that make the platform indispensable. From post-call analysis to real-time AI voice agents, Ryan delves into the technology that differentiates CallRail, including the challenges and future of AI in conversation intelligence. Whether you're a product manager, marketer, or business owner, this episode offers valuable lessons on leveraging AI to solve long-standing problems and improve customer interactions. 00:00 Introduction to the AI Innovators Podcast 00:27 Understanding CallRail's Unique Market Position 01:06 The Evolution of CallRail 03:23 The Marketing Automation Landscape 03:52 Conversation Intelligence in Marketing vs. Sales 08:13 Diverse Industries Served by CallRail 10:50 Adoption and Implementation of CallRail 14:54 AI in SMBs vs. Enterprises 19:26 Building Better Products in the Age of AI 21:53 Leveraging AI to Solve Existing Problems 22:19 Challenges and Improvements in AI Transcription 22:49 Advice for Product Managers 24:25 AI Voice Agents and Transcription Accuracy 25:59 Post-Call Analysis and Aggregated Insights 27:41 Industry-Specific Accuracy Requirements 28:40 Real-Time Transcription and Custom Context 33:26 CallRail's Market Leadership and Strategy 36:26 The Value of Context in AI Solutions 41:26 Future Challenges in Conversation Intelligence 43:49 Conclusion and Final Thoughts
6 months ago
44 minutes 42 seconds

AI in Retail: Personalization, Ethics, and Scale - Ep 15 | Shashank Kapadia | Walmart Global Tech
In this episode of the AI Innovators podcast, host Prashanth interviews Shashank Kapadia, a staff machine learning engineer at Walmart Global Tech. This episode is brought to you by SaladCloud, the most affordable cloud for AI/ML inference at scale (www.salad.com). Shashank shares insights on his role in developing and operationalizing machine learning models at scale within Walmart. He discusses his extensive experience in machine learning, including pivotal moments that reshaped his approach. They delve into the challenges and learning moments in recommendation systems, the importance of considering user experience holistically, and the phenomenon of the 'next best algorithm syndrome.' Shashank also highlights AI's role in enterprise and balancing innovation with operationalization. The conversation covers the broader integration of AI in business operations, the ethics and fairness of AI systems, and how ethical AI can serve as a business moat. Towards the end, they touch upon the technical and operational considerations for scalable AI platforms and emerging AI technologies promising advancements in personalization. This episode provides a deep dive into the intersection of AI and retail, offering valuable insights for both industry professionals and enthusiasts. 00:00 Introduction and Guest Welcome 00:28 Role and Responsibilities at Walmart 01:10 Key Projects and Learnings in Machine Learning 03:39 Challenges in Implementing AI Models 04:15 Balancing Innovation and Operationalization 07:26 Enterprise vs. Startup AI Implementation 10:22 Transition from Engineer to Leader 16:51 Personalization and Ethical Considerations 31:50 Scalable AI Platforms and Architectural Decisions 38:17 Future of AI in Retail 42:07 Conclusion and Final Thoughts
6 months ago
42 minutes 50 seconds

Behind Phoenix: Leonardo AI’s foundational model - Ep 13 | Aninda Saha | Leonardo AI
From PhD to AI Innovator: Aninda Saha discusses joining Leonardo AI's founding research team, the Phoenix foundational model, Leonardo's integration with Canva, tackling bias in AI research and more. This podcast is brought to you by SaladCloud: www.salad.com In this episode of the AI Innovators Podcast, host Prashanth interviews Aninda Saha, a Senior AI Research Engineer at Leonardo, now part of Canva. Aninda shares his journey from completing a PhD to joining Leonardo, helping build their Phoenix model, and the eventual acquisition by Canva. The discussion covers the challenges and innovations in AI generative art, particularly focusing on prompt adherence, knowledge distillation, and the democratization of AI. Aninda also provides insights into the technical aspects of AI research, including issues of diversity, regional tastes, and optimizing generative models. The episode concludes with Aninda's thoughts on the future of AI, including generative gaming and sustainable AI development strategies. 00:00 Introduction and Guest Welcome 00:23 Aninda's Journey from PhD to Leonardo 01:31 Early Days at Leonardo and Startup Experience 04:11 PhD Research and Knowledge Distillation 07:37 Democratization of AI 12:25 Leonardo's Growth and Success 14:51 Technical Challenges and Innovations 19:29 Leonardo's Integration with Canva 32:02 Future of AI and Closing Remarks
7 months ago
34 minutes 47 seconds

Trust and Safety for AI Avatars: From Politics to Personal Life - Ep 14 | HeyGen | Lavanya Poreddy
"AI avatars are reshaping elections, content moderation, and even grief. But can we truly trust them?" 🗳️🤖 Lavanya Poreddy from HeyGen reveals the high-stakes world of trust and safety in AI - moderating deepfakes, safeguarding data, and more. Listen now! 🎙️ #AI #DeepfakeDetection Join us for an insightful episode of the AI Innovators Podcast as we sit down with Lavanya Poreddy, Head of the Trust and Safety Division at HeyGen. Lavanya discusses HeyGen's AI avatar technology and its practical applications across various sectors from elections to personal life. She shares fascinating stories and real-world examples, emphasizing the importance of trust and safety in AI innovations. From handling deepfake concerns during the US elections to proactive and reactive moderation strategies, Lavanya reveals the intricate balance of protecting user data, ensuring ethical use, and navigating the challenges of content moderation during political events. This episode not only highlights the groundbreaking work at HeyGen but also provides a closer look at the critical role of trust and safety in the evolving AI landscape. 00:00 Introduction to the AI Innovators Podcast 00:22 Overview of HeyGen's Capabilities 01:00 Use Cases and Market Reach 03:04 Interactive Avatars and Innovation 04:49 Trust and Safety in AI 05:34 Lavanya's Journey in Trust and Safety 12:01 Proactive vs Reactive Moderation 18:05 Content Moderation at HeyGen 26:43 Challenges in Political Content Moderation & Elections 27:37 Voice Recognition and Political Content Monitoring 28:36 Celebrity Voice Detection and Language Translation 29:23 Rapid Response to Harmful Content 30:49 AI Tools and Trust in HeyGen 33:37 Data Privacy and User Safety 38:54 Balancing Innovation and Trust 41:44 Heartwarming Uses of HeyGen 46:22 The Ongoing Challenge of Trust and Safety 49:37 Conclusion and Final Thoughts
8 months ago
50 minutes 3 seconds

Undetectable AI’s Journey to 15 Million Users - Christian Perry | AI Innovators podcast | Ep 12 | Salad
In this episode of the AI Innovators Podcast, Christian Perry, the founder of Undetectable.ai, shares the story behind growing to 15 Million users for his AI detection and humanizer tool. This podcast is brought to you by SaladCloud: www.salad.com Christian discusses his journey from his previous venture Chatterquant, the flaws in early AI content detectors, and the vision behind Undetectable.ai. He highlights the challenges and strategies for humanizing AI-generated content, the global impact of AI detection, and the company’s rapid growth to 15 million users. Additionally, Christian shares insights into future product innovations, the importance of corporate culture, and effective growth hacking techniques. The conversation also touches on the societal implications of AI, especially in terms of content creation and detection across different languages and regions. 00:00 Introduction to AI Innovators Podcast 00:14 Christian Perry's Journey into AI 00:50 Challenges with AI Detection 02:00 Undetectable's Market Impact 04:20 AI Detection vs. Humanization 08:26 Growing to 15 Million users 12:31 Future of AI and Undetectable 21:02 Maintaining Company Culture at a Startup 24:18 Lightning Round: Quick Insights 27:08 Conclusion #ai #aicontent #podcast #aipodcast #startup #cloudcomputing #artificialintelligence #aidetector #llm
8 months ago
27 minutes 35 seconds

Ep 11 - CTO's perspective on cost reduction & globalization in AI - Mohamed Rashad - Hyperion AI
This podcast is brought to you by SaladCloud: www.salad.com. In this episode of the AI Innovators Podcast, we are joined by Mohamed Rashad, the CTO of Hyperion AI. Mohamed shares his journey from his initial steps in AI to co-founding various AI startups and working with major companies like Nokia. He discusses the challenges startups face in bringing AI products to market and how Hyperion AI helps bridge the gap from proof of concept to production. The conversation also touches on the rise of localized AI technologies, the trend of solopreneurs, and the role of AI in leveling the global playing field. Mohamed delves into the importance of sustainability and risk management for AI startups, the potential of no-code tools, and shares his thoughts on distributed cloud solutions like Salad. 00:00 Introduction to the AI Innovators Podcast 00:32 Mohamed Rashad's Career Journey 02:51 Bridging the Gap in AI Development 05:23 Global AI Hubs and Localized Technologies 08:48 The Rise of Solopreneurs and One-Man Startups 11:20 Challenges in AI Adoption and Production 29:26 The Future of No-Code and Low-Code Tools 33:45 Closing Thoughts and Environmental Impact
8 months ago
36 minutes 22 seconds

AI Innovators - Ep 10 - James Verbus, Sr. Staff Software Engineer, Machine Learning at LinkedIn
In this episode of the AI Innovators Podcast by SaladCloud (www.salad.com), James Verbus, a Sr. Staff Machine Learning Engineer at LinkedIn, shares his journey from particle astrophysics to machine learning at LinkedIn. He discusses identifying abuse on social media platforms, the ethical challenges of AI-generated content, the importance of trust and explainability in AI systems, and the ongoing arms race between AI detection systems and bad actors. Verbus also emphasizes the balance between effective detection and maintaining a positive user experience, and offers advice for aspiring machine learning engineers on how to advance their careers.
Takeaways
- Transitioning from astrophysics to machine learning for broader opportunities.
- Detecting abuse on social media parallels searching for dark matter.
- LinkedIn's trust team addresses various types of abuse and trust issues.
- AI can be used for both trust and abuse in social media.
- Explainability in AI is crucial for user trust.
- AI tools can enhance creativity and content creation.
- The landscape of AI is heterogeneous and requires customization.
- Bad actors quickly adopt new technologies for scams.
- AI-generated content raises ethical concerns and user distrust.
- Robust detection systems are essential in the arms race against AI abuse.
- Models can detect AI-generated photos with high accuracy.
- Continuous model updates are essential for effective detection.
- Bad actors exploit AI-generated images for fake accounts.
- Training data quality and volume are critical for model performance.
- Detection models operate at various stages of user activity.
- Balancing detection and user experience is crucial.
- AI engineers should focus on solving real business problems.
- Effective communication of results is key for career advancement.
- Being T-shaped in skills can enhance impact in teams.
- A holistic approach to abuse prevention increases effectiveness.
This podcast is brought to you by SaladCloud: www.salad.com
9 months ago
44 minutes 37 seconds

Transcribing 20 Million hrs/month for 95% less - Ep 9 - Petros Syntelis - Virgin Media O2
From Astrophysics to AI Transcription: Petros Syntelis' Journey with Virgin Media O2. In this episode of the AI Innovators Podcast by Salad Technologies, host Prashanth interviews Petros Syntelis, Principal Data Scientist at Virgin Media O2. Petros shares his transition from academia in solar physics to the machine learning space at Virgin Media O2, highlighting the reasons behind his career shift and the challenges faced. They discuss the role of data science in telecommunications, including personalization, pricing, operations, and network expansion. Petros dives into the practicalities and cost-effectiveness of deploying open-source models like Whisper for transcription services and emphasizes the importance of team structure and cost considerations in machine learning projects. The conversation also touches on emerging trends in AI and the impact of distributed cloud solutions on enterprises and startups. 00:00 Introduction and Guest Welcome 00:28 Petros Syntelis' Background and Career Journey 01:31 Transition from Astrophysics to Data Science 05:57 Role and Responsibilities at Virgin Media O2 06:40 Team Structure and Project Management 09:06 Applications of Data Science in Telecommunications 13:03 Challenges and Innovations in Call Center Transcription 19:08 Cost Analysis and Benefits of Self-Hosted Whisper 24:46 Technical Challenges and Infrastructure 32:14 Emerging Trends and Future of AI 38:55 Closing
10 months ago
39 minutes 19 seconds

Evolution of AI in Marketing - Sudha Reddy - Rava AI | AI Innovators Podcast | Ep #8 | Salad
In this episode of the AI Innovators podcast, Sudha Reddy, CMO and co-founder of Rava AI, discusses the evolution of AI in marketing, the challenges faced by marketers in a saturated tool landscape, and the importance of adapting marketing strategies in the age of AI. She also discusses the need for startups to focus on distribution and branding rather than just technology, and explores the future of AI tools in content creation and lead generation.   This podcast is brought to you by Salad, the world's largest distributed cloud. Learn more: https://salad.com/ Try SaladCloud today: https://portal.salad.com Takeaways - Rava AI as an AI marketing platform that manages content generation. - The competition in AI marketing tools is intense and ever-evolving. - Marketers must adapt to new tools and strategies to stay relevant. - AI has democratized content generation, making it accessible to all. - Understanding target audiences is crucial for effective marketing. - Distribution and branding are now key differentiators in startups. - AI tools are increasingly capable of generating human-like content. - Marketers need to experiment with various channels to reach audiences. - The landscape of marketing is shifting from text to video and other formats. - AI can assist in content creation, but human insight remains essential.  
10 months ago
23 minutes 59 seconds

Ep 7 - AI image generation & the startup journey - Rohit Rao - Segmind
In this episode of the AI Innovators Podcast, Rohit Rao, CEO of Segmind, discusses the journey of his company from its inception to its current role in democratizing generative AI for marketers. He shares insights on the transition from medical AI to generative AI, the challenges faced in building a startup, and the importance of open source in gaining initial traction. Rohit also explores the future of AI text-to-image generation, the ethical considerations surrounding AI, and the sustainability of AI technologies. This podcast is brought to you by Salad, the world's largest distributed cloud. Learn more: https://salad.com/ Try SaladCloud today: https://portal.salad.com Takeaways - Segmind simplifies the use of generative AI models for marketers. - The transition from medical AI to generative AI offers faster gratification. - Founding a startup today is easier for smaller teams but harder in competition. - Setbacks are common in the startup journey and require resilience. - Open source projects can effectively build initial user bases. - The launch of SegMOE aims to make AI models more accessible. - Future innovations in AI will focus on multimodal capabilities. - AI image generation is becoming more sustainable and efficient. - Ethical considerations in AI are crucial for responsible development. - The landscape of AI is rapidly evolving, requiring constant adaptation.
1 year ago
29 minutes 16 seconds

Ep 6 - AI adoption & procurement - Henry Stanley - Fabrik
In this episode, Henry Stanley, Chief Product Officer and Co-Founder of Fabrik, discusses the challenges of AI adoption and procurement in the B2B trust ecosystem. This podcast is brought to you by Salad, the world's largest distributed cloud. Learn more: https://salad.com/ Try SaladCloud today: https://portal.salad.com Takeaways - AI adoption and procurement in the B2B trust ecosystem present challenges related to security, compliance, and trust. - Compliance is a risk function that helps organizations manage risk and build trust with customers. AI introduces new risks and frameworks for governance and compliance. - Tools and solutions in the compliance and security ecosystem, such as Vanta and Creo AI, are emerging to address AI-specific governance and controls. - Fabrik aims to help companies proactively demonstrate their security and compliance posture and build trust with customers. Chapters 00:00: Challenges of AI Adoption and Procurement 03:24: The Role of Compliance in Risk Management 06:14: New Risks and Frameworks for AI Governance 08:05: Emerging Tools and Solutions in Compliance and Security 10:59: Proactively Demonstrating Security and Compliance
1 year ago
16 minutes 24 seconds

Ep 5 - AI for easier tax filing - Daniel Marcous - April
In this episode, we interview Daniel, a CTO who solved two of the most frustrating problems for everyday people - taxes & traffic. Daniel Marcous is the CTO of April (AI tax-prep software) and ex-CTO of Waze (popular crowdsourced traffic app). This podcast is brought to you by Salad, the world's largest distributed cloud. Learn more: https://salad.com/ Try SaladCloud today: https://portal.salad.com Sound bites: "So we don't only meet you once a year in April when you need to file your taxes where it's far too late to actually do something impactful, but we accompany you throughout the year in order to save you the big bucks and actually optimize your taxes" "Taxes need to be 100 % accurate. You can't make mistake one every 100 users. Just not doable. So there's tons of guardrails that we've put basically everywhere so it will be accurate " "This role called data scientist is going to slowly shrink into becoming something probably not going to remain in existence for longer the way we know it, as the power is being shifted to engineers really, really fast in everything that has to do in AI" Takeaways April is an AI-powered tax preparation company that offers tax solutions for financial institutions and individuals. The use of AI in tax optimization can help individuals save time and money by automating tax planning, tax optimization, and tax filing. Bridging the gap between tax knowledge and technical expertise is crucial in building AI-powered tax software. The role of data scientists is evolving, and engineers are becoming more involved in AI development. Cloud costs can be managed by understanding the trade-off between business outcomes and costs and optimizing financial operations. Choosing an alternate cloud provider should be based on the specific needs of the business and the ability to solve product requirements. The entry barriers for entering tech are lower, and formal education is becoming less important in the industry. 
Daniel's favorite tech gadget is a tag locator, and his favorite way to unwind is by watching anime TV shows.
1 year ago
37 minutes 40 seconds

Ep 4: AI tools, cost & examples of video accessibility done right - Michelle Inaba
Summary
Michelle Inaba, a product manager, discusses the importance of making video content globally accessible. This includes providing options such as automatic captions, subtitles, and dubbing in multiple languages. Companies are using strategies like automatic speech recognition and dubbing to make their video content accessible. The future of video accessibility could include AI real-time dubbing and improved automatic captions. While AI is efficient and cost-effective, human involvement is still essential for tasks that require nuance and accuracy. It is best to use a hybrid approach, combining AI solutions with human verification. The cost of video accessibility with AI varies depending on factors like content length and complexity. However, the upfront expenses are worth it for the increased audience reach and engagement. Video accessibility positively impacts business metrics like revenue and brand recognition by expanding global reach and attracting diverse audiences. There are several AI tools and services available for video accessibility, including Papercup AI, Dubbing AI, Rev.ai, and Google Cloud Translation. Michelle also provides many real-life examples, including Mr. Beast, of how accessibility can increase revenue for creators and businesses alike.
Takeaways
- Video accessibility means making content globally accessible through options like automatic captions, subtitles, and dubbing in multiple languages.
- Companies use strategies like automatic speech recognition and dubbing to make their video content accessible.
- The future of video accessibility could include AI real-time dubbing and improved automatic captions.
- While AI is efficient and cost-effective, human involvement is essential for tasks that require nuance and accuracy.
- A hybrid approach, combining AI solutions with human verification, is recommended for video accessibility.
- Video accessibility positively impacts business metrics like revenue and brand recognition by expanding global reach and attracting diverse audiences.
- AI tools and services like Papercup AI, Dubbing AI, Rev.ai, and Google Cloud Translation are available for video accessibility.
Sound Bites
"A good example here is, in my opinion is Mr. Beast. He has his channel in several different languages. And he does that by using dubbing. And by doing that, he is making his content available to so many different people from so many different countries, so many different languages"
"In the case of a streamer or any other individual that's like a video content creator, YouTuber or whatever, expanding your markets internationally allows you to have more interest, to be more interested in product placements in your videos. And you become more attractive for brands partnerships"
"In the future, we could expect to see some AI real-time dubbing and an improvement on the automatic captions."
Chapters
00:00 Introduction to Video Accessibility
01:35 Strategies for Making Video Content Accessible
02:02 The Future of Video Accessibility
03:30 The Role of Humans in Video Accessibility
07:20 The Cost of Video Accessibility with AI
15:49 The Business Impact of Video Accessibility
18:49 AI Tools for Video Accessibility
1 year ago
21 minutes 44 seconds

Ep 3: AI applications in the enterprise - Chip Ernst from Roli.ai
Chip Ernst, CEO of Roli.ai, discusses the challenges of integrating AI into enterprise applications and the need for easy and robust cloud engineering solutions. He emphasizes the importance of considering the complexity of enterprise environments and the need for validation and control when using AI models. Chip provides examples of use cases, such as AI-supported responses for automotive service organizations and doctor case note expansion, where human validation is crucial. He also highlights the need for flexibility and adaptability in AI solutions as technologies and models evolve. Takeaways Integrating AI into enterprise applications requires robust cloud engineering solutions. Validation and control are essential when using AI models, and human involvement is necessary to ensure accuracy and accountability. Use cases for AI in enterprise include AI-supported responses for automotive service organizations and doctor case note expansion. Flexibility and adaptability are crucial in AI solutions as technologies and models evolve. Sound Bites "Can you do the same old thing smarter with AI?" "AI technologies are easily accessible, so accessible that every single one of us could sit down at our laptop and query with a prompt." "The AI is new, but the idea of adding intelligence or some sort of clever process to a business application is not." Learn more about Salad: https://salad.com Try SaladCloud today: https://portal.salad.com  
1 year ago
20 minutes 49 seconds

Ep 2: ASR models, accuracy, cost & the role of humans - Aleks Smechov from Wordcab
In this conversation, Derick Thompson from Salad Technologies interviews Aleks Smechov from Wordcab about transcription, ASR, and accessibility. They discuss the importance of accurate transcripts for global accessibility, the different definitions of verbatim transcription, and the impact of audio cues. They also talk about the best ASR models, tools for post-processing, and the need for human editors in transcription. The conversation concludes with a discussion on the future of ASR and transcription.
Takeaways
- Accurate transcripts are crucial for global accessibility, allowing people with disabilities to understand audio and video content.
- Different definitions of verbatim transcription exist, ranging from including all disfluencies to a more cleaned-up version.
- Audio cues, such as laughter or coughing, are important for accessibility and may need to be added during transcription.
- The best ASR models for transcription depend on the specific use case and language requirements.
- Post-processing is essential for improving transcript accuracy, especially for industry-specific terms and difficult words.
- Human editors play a vital role in fine-tuning transcripts and adding value through post-processing and audio cues.
- The future of ASR and transcription lies in increasing accuracy, reducing word error rates, and focusing on post-processing capabilities.
- Transcription will become a commodity, and the real value will come from what can be done with the transcript after transcription.
- Using cost-effective GPU instances and cloud-agnostic tools is important for hosting ASR models.
- The goal is to provide reliable and affordable transcription services to meet the needs of different use cases.
Sound Bites
"Accessibility in terms of video and audio, captions and transcription in general, is making sure that people who have some sort of disability, maybe they're hard of hearing or deaf, are still able to understand the captions or subtitles or transcript as well as someone who could hear."
"Transcript editing will always be there as a kind of a last mile thing for edge cases and there will always be edge cases."
"Transcription will become a commodity or table stakes like, you'll have to have excellent transcription, 95% accuracy, et cetera, in the future. And the real value will come in with what you could do after."
Chapters
00:00: Introduction and Overview of Wordcab
01:14: Defining Verbatim Transcription and Audio Cues
07:03: Choosing the Best ASR Models for Transcription
09:26: The Importance of Post-Processing in Transcription
12:51: Accuracy, Word Error Rate, and Transcription
14:17: Tools and Approaches for ASR and Transcription
19:43: The Future of ASR and Transcription
21:08: Optimizing ASR Performance and Cost
22:07: Providing Reliable and Affordable Transcription Services
1 year ago
20 minutes 17 seconds

Ep 1: How AI dubbing is making videos globally accessible - Doniyor Ulmasov from Papercup
Summary
In this conversation, Doniyor Ulmasov, Head of Engineering at Papercup, discusses the process of making videos globally accessible through AI dubbing and localization. He explains the differences between captions, subtitles, and dubs, and how dubbing involves adapting the source content to the target audience. Doniyor also shares insights into the multi-step process of dubbing, including transcription, translation, and text-to-speech models. He highlights the importance of human validation in maintaining quality and discusses the challenges of expanding beyond English. The conversation concludes with a discussion on the cost-effectiveness of dubbing and the potential for Papercup to become a global dubbing solution.
Takeaways
- Video accessibility involves making videos globally accessible in multiple languages.
- Dubbing is the process of adapting the source content to the target audience.
- The dubbing process includes transcription, translation, and text-to-speech models.
- Human validation is crucial for maintaining quality in dubbing.
- Expanding beyond English poses challenges in accuracy and pipeline management.
- Dubbing can be a cost-effective solution compared to traditional dubbing houses.
- Papercup aspires to become a global dubbing solution for video accessibility.
Sound Bites
"A video is globally accessible when it can reach as many people as possible and as many languages as possible."
"If you do a literal translation, you're going to lose the joke, right? That's why it's called adaptation, not translation."
"Once we achieve the translation layer, then we move to the text-to-speech model."
Chapters
00:00 Introduction and Background
01:19 Caption, Subtitle, and Dubbing Differences
03:05 Text-to-Speech and Voice Assignment
05:03 Serverless GPU Options for Cost Optimization
08:18 Recommended Open Source Models
10:37 Challenges in Expanding Beyond English
11:06 Human Validation in Maintaining Quality
12:04 The Cost-Effectiveness of Dubbing
12:57 Papercup's Aspiration as a Global Dubbing Solution
1 year ago
28 minutes 40 seconds
