Building the Backend: Data Solutions that Power Leading Organizations
Travis Lawrence
43 episodes
9 months ago
In this episode we speak with Justin Borgman, Chairman & CEO at Starburst, which is based on open source Trino (formerly PrestoSQL) and was recently valued at $3.35 billion after securing its Series D funding. We discuss the convergence of data warehouses (DWs) and data lakes (DLs), why data lakes fail, and much more. Top 3 takeaways: The data mesh architecture is gaining adoption more quickly in Europe due to GDPR. There were two main limitations of data lakes when comparing to DWs, perf...
Technology, Education, How To
All content for Building the Backend: Data Solutions that Power Leading Organizations is the property of Travis Lawrence and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (20/43)
The Analytics Engine for All Your Data with Justin Borgman @ Starburst
In this episode we speak with Justin Borgman, Chairman & CEO at Starburst, which is based on open source Trino (formerly PrestoSQL) and was recently valued at $3.35 billion after securing its Series D funding. We discuss the convergence of data warehouses (DWs) and data lakes (DLs), why data lakes fail, and much more. Top 3 takeaways: The data mesh architecture is gaining adoption more quickly in Europe due to GDPR. There were two main limitations of data lakes when comparing to DWs, perf...
3 years ago
36 minutes

Transform Your Object Storage Into a Git-like Repository With Paul Singman @ LakeFS
In this episode we speak with Paul Singman, Developer Advocate at Treeverse / LakeFS. LakeFS is an open source project that allows you to transform your object storage into a Git-like repository. Top 3 takeaways: LakeFS enables use cases like debugging, where you can quickly view historical versions of your data at a specific point in time, and running ML experiments over the same set of data with branching. The current data landscape is very fragmented with many tools available. Over the coming ...
3 years ago
27 minutes

Enable Faster Data Processing and Access with Apache Arrow with Matt Topol @ FactSet
In this episode we speak with Matt Topol, Vice President, Principal Software Architect @ FactSet, and dive deep into how they are taking advantage of Apache Arrow for faster processing and data access. Below are the top 3 value bombs: Apache Arrow is an open-source in-memory columnar format that creates a standard way to share and process data structures. Apache Arrow Flight eliminates serialization and deserialization, which enables faster access to query results compared to traditional JDB...
3 years ago
49 minutes

Implementing Amundsen @ Convoy with Chad Sanderson
In this episode we speak with Chad Sanderson, head of data and early stage startup advisor focused on data innovation @ Convoy, and uncover their journey to implementing Amundsen, an open source data catalog. Below are the top 3 value bombs: Data scientists should not be spending the majority of their time trying to find the data they are interested in. Amundsen is a powerful open source data catalog that integrates across your data landscape to provide visibility into your da...
3 years ago
35 minutes

The Importance of Treating Your Data Initiatives as Products with Murali Bhogavalli
Your data team should not just be keeping the lights on, but should be building and creating data products to support the business. In this episode we speak with Murali Bhogavalli, a data product manager, and explore what a data product manager is and how the role differs from that of a traditional product manager. Below are the top 3 value bombs: Data should be looked at as a product and treated as such within the organization (i.e. agile methodologies, continuous improvement…). Organizations nee...
3 years ago
26 minutes

Open-Source Data Catalog Amundsen with Mark Grover @ Stemma
In this episode of Building The Backend we hear from Mark Grover, founder @ Stemma and co-creator of Amundsen. Stemma is a fully managed data catalog, powered by the leading open-source data catalog, Amundsen. Below are the top 3 value bombs: Automated data catalogs are critical to help wrangle the growing data across organizations (i.e. being able to identify that, out of 150 columns on a table, only 10 are being used downstream). Tribal knowledge and context cannot be automated - data catalogs canno...
3 years ago
41 minutes

Architecting a Modern Data Lake with Dipti Borkar from Ahana
In this episode of Building The Backend we hear from Dipti Borkar, cofounder @ Ahana, a managed service for Presto on AWS. We talk all about the data lake, how it should be structured and where the industry is going. Below are the top 3 value bombs: Presto is an open source distributed SQL query engine originally created by Facebook. It is mainly used to run SQL queries on data lakes but can be connected to relational data stores as well. Ahana is a managed Presto service on AWS wit...
3 years ago
39 minutes

Open Source BI with Apache Superset
What tools are you using for data viz? Are they low cost? One option is Apache Superset. In this episode we speak with Robert Stolz to learn more about Superset and other open source data tools. Top 3 Value Bombs: One popular use case with Apache Superset is embedding it within applications; because it’s open source, there is a wide range of flexibility to integrate it with existing systems. Apache Superset supports any sources supported by the Python SQL toolkit called SQLAl...
4 years ago
29 minutes

Edge Computing and Continuous Intelligence with Swim
In this episode of Building The Backend we hear from Simon Crosby, CTO @ Swim, an open source edge computing operating system. We talk all about edge computing, event streaming and much more. Below are the top 3 value bombs: Edge means more than just being physically located somewhere; it could also mean in the cloud. It really is the closest point to where your source data is being generated. Continuous intelligence is a design pattern where streaming data is directly tied into b...
4 years ago
34 minutes

12 Modern Data Architecture Principles That Should Be Implemented in 2022
This episode is a little different from the usual format. Instead of interviewing a data leader, I share what I consider to be the 12 most important principles when designing a modern data architecture. Please message me on LinkedIn with your thoughts on this show.
4 years ago
20 minutes

The Keys to Good Data Quality With Prukalpa Sankar from Atlan
In this episode of Building The Backend we hear from Prukalpa Sankar, co-founder of Atlan. We talk all about data quality and governance, common issues organizations face when implementing data quality, and much more. Below are the top 3 value bombs: Data governance has a bad reputation. It should not be a bureaucratic controlling process that is pushed from the top down. Active metadata is key to modern data architectures; essentially it’s putting together all the human and ...
4 years ago
37 minutes

Designing a Modern Data Architecture – Teradata
This is a podcast episode you do not want to miss with Stephen Brobst, CTO @ Teradata. We discuss all things data warehouses, the shift to the distributed cloud, and key principles for implementing successful DWs. Top 3 Value Bombs: Large organizations are shifting more to a distributed / inter-cloud architecture for many reasons, including data sovereignty, increasing residency and reducing costs. Just because your DW does not support indexing does not mean you do not need...
4 years ago
44 minutes

Exploring Open-Source Data Integration With Airbyte
“The hardest part of ETL is not building the connectors, it is maintaining them.” Truer words were never spoken. I really enjoyed this episode with Michel Tricot, CEO & Co-Founder of Airbyte, where we discuss all things data integration and connectors. Top 3 value bombs: The future of ETL/ELT integration connectors may lie with open source. Many closed source data integration tools only create connectors if the ROI is there, but this leaves many tools out and speed to market can be slow. Airby...
4 years ago
35 minutes

How To Effectively Reduce Data Quality Incidents 10x with Datafold
This episode features Gleb Mezhanskiy, Co-Founder & CEO @ Datafold. During our discussion we talk all about data observability and how to improve your data quality. Before Datafold, Gleb was a founding member of data teams at Lyft and Autodesk, where he built sophisticated data platforms and developed tooling to improve productivity and data quality. Top 3 Value Bombs: The foundation of any data observability platform is the data catalog. Data observability becomes increasingly difficul...
4 years ago
39 minutes

Applying Transformations to Streaming Data with Materialize
This episode features Arjun Narayan, Co-Founder & CEO @ Materialize. During our discussion we talk all about transforming streaming data, the dos and don’ts, and how Materialize is changing the landscape of streaming. Top 3 Value Bombs: When making schema changes, organizations should always strive to make only forward-compatible changes. This means consumers will be able to consume your data model without being impacted; they just may be missing your newly added column. M...
4 years ago
32 minutes

Optimizing Spark in the Cloud - with Jean-Yves Stephan
This episode features Jean-Yves Stephan, Co-Founder & CEO @ Data Mechanics (recently acquired by Spot by NetApp). During our discussion we talk about optimizing Spark to run in the cloud at a low cost. Top 3 Value Bombs: Running Spark can be expensive, but there are ways to reduce your current operating costs by 50-75% through smart automations (i.e. tuning for node type, memory and CPU). Spot instances can lower your costs by utilizing unused instances. Creating serverless architectures...
4 years ago
32 minutes

How To Achieve Better Observability and Control Over Your Data Pipelines with Josh Benamram
This episode features Josh Benamram, who is the co-founder of Databand. Databand is a company that helps engineering teams achieve better observability and control over their tech stack. Top 3 Value Bombs: When observing our data we should be looking at both our data and our pipelines. Don’t wait till the board meeting for an incorrect metric to make DQ a priority. Having clear SLAs on just what data quality means across the organization is essential.
4 years ago
37 minutes

Unify Your Data Operations with Nexla
Travis welcomes Saket Saurabh to his podcast, who provides a window into the world of data management and the self-service options that are democratizing it. Co-founder and CEO of Nexla, Saket has a passion for data and infrastructure and how to improve its flow among partners, customers and vendors. Nexla automates various data engineering tasks, intelligently creates an abstraction of data and enables collaboration among people at different skill levels. Named a 2021 Cool Vendor by Gartner,...
4 years ago
25 minutes

A Powerful Open Source Database That Supports Many Storage Needs (MariaDB)
In this episode, we speak with Rob Hedgpeth, a director of developer relations at MariaDB. We explore all things MariaDB, the capabilities it has and when you should consider it for your next project. Top 3 value bombs: MariaDB follows a shared-nothing architecture and supports distributed SQL for unlimited scaling on demand. MariaDB can handle many types of storage (i.e. document store, graph and spatial). When deciding on your next relational database do not just l...
4 years ago
27 minutes

Increase the Quality and Reliability of Your Data
In this episode, we speak with Lior Gavish, the co-founder of Monte Carlo, to explore all things data quality. Monte Carlo is a data lineage and observability tool that lowers your data downtime. Top 3 Value Bombs: Data products should be thought of in their entirety, from the source to the consumer. No one data stakeholder can solve data quality issues; it’s a collaboration of the data engineers, business, data consumer and even software to help automate certain aspects of cataloging and captu...
4 years ago
31 minutes
