Storage Developer Conference
SNIA Technical Council
202 episodes
7 months ago
Every week the Storage Developer Conference (SDC) podcast presents important technical topics to the Storage Developer community. Each episode is hand selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at www.snia.org/podcasts.
Technology
All content for Storage Developer Conference is the property of SNIA Technical Council and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
#190: Kinetic Campaign: Speeding Up Scientific Data Analytics with Computational Storage
Storage Developer Conference
48 minutes 53 seconds
2 years ago
Large-scale data analytics, machine learning, and big data applications often require storing massive amounts of data. To get high bandwidth cost-effectively, many data centers use tiered storage, with warmer tiers built from flash or persistent memory modules and cooler tiers provisioned with high-density rotational drives. Research communities and industry have increasingly demonstrated ultra-fast data insertion and retrieval rates on warm storage. Even so, complex queries with predicates on multiple columns still tend to experience excessive delays when unordered, unindexed (or only lightly indexed) data, written in log-structured formats for high write bandwidth, is later read for ad-hoc analysis at row level. Queries run slowly because an entire dataset may have to be scanned in the absence of a full set of indexes on all columns. In the worst case, significant delays occur even when data is read from warm storage, and delays are higher still when data must be streamed from cool storage before analysis can take place.

In this presentation, we present C2, a research collaboration between Seagate and Los Alamos National Lab (LANL) for the lab's next-generation campaign storage. Campaign is a scalable cool storage tier at LANL, managed by MarFS, that currently provides 60 PB of storage space for longer-term data storage. Cost-effective data protection is done through multi-level erasure coding at both node level and rack level. To prevent users from always having to read back all data for complex queries, C2 enables direct data analytics at the storage layer by leveraging Seagate Kinetic Drives to asynchronously add indexes to data at per-drive level after data lands on the drives. The asynchronously constructed indexes cover all data columns and are read at query time by the drives to drastically reduce the amount of data that needs to be sent back to the querying client for result aggregation.
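The idea of per-drive indexes built after the write, then consulted at query time, can be illustrated with a small in-memory sketch. This is not the C2 design or the Kinetic API: the `Drive` class, its methods, `run_query`, and the column names `step` and `energy` are all hypothetical, and the "asynchronous" index build is reduced to an explicit call. It only shows why a client receives fewer rows when each drive filters locally.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class Drive:
    """Hypothetical stand-in for one drive; not the Kinetic API."""

    def __init__(self):
        self.rows = []         # append-only log of unordered rows (dicts)
        self.indexes = {}      # column -> sorted list of (value, row_id)
        self.indexed_upto = 0  # rows[:indexed_upto] are covered by indexes

    def append(self, row):
        # Writes stay cheap: no index maintenance on the insert path.
        self.rows.append(row)

    def build_indexes(self):
        # Would run in the background on a real drive; here an explicit call.
        columns = defaultdict(list)
        for row_id, row in enumerate(self.rows):
            for col, value in row.items():
                columns[col].append((value, row_id))
        self.indexes = {col: sorted(e) for col, e in columns.items()}
        self.indexed_upto = len(self.rows)

    def _ids_in_range(self, col, lo, hi):
        # Binary search over the sorted (value, row_id) pairs of one column.
        entries = self.indexes.get(col, [])
        start = bisect_left(entries, (lo, -1))
        end = bisect_right(entries, (hi, len(self.rows)))
        return {row_id for _, row_id in entries[start:end]}

    def query(self, predicates):
        """predicates maps column -> (lo, hi), both bounds inclusive."""
        ids = None
        for col, (lo, hi) in predicates.items():
            matched = self._ids_in_range(col, lo, hi)
            ids = matched if ids is None else ids & matched
        if ids is None:  # no predicates: every indexed row matches
            ids = set(range(self.indexed_upto))
        # Rows written after the last index build are checked by scanning.
        for row_id in range(self.indexed_upto, len(self.rows)):
            row = self.rows[row_id]
            if all(col in row and lo <= row[col] <= hi
                   for col, (lo, hi) in predicates.items()):
                ids.add(row_id)
        return [self.rows[i] for i in sorted(ids)]


def run_query(drives, predicates):
    # The client only aggregates rows that each drive already filtered.
    results = []
    for drive in drives:
        results.extend(drive.query(predicates))
    return results


drives = [Drive(), Drive()]
for i in range(100):
    drives[i % 2].append({"step": i, "energy": i * 0.5})
for drive in drives:
    drive.build_indexes()

hits = run_query(drives, {"step": (10, 20), "energy": (7.0, 100.0)})
print(len(hits), "of 100 rows returned to the client")
```

In this toy setup the multi-column predicate is answered by intersecting per-column index lookups on each drive, so only matching rows cross the network; without the indexes every drive would have to scan, or ship, all of its rows.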
Combining computational storage technologies with erasure-coding-based data protection schemes for rapid data analytics over cool storage presents unique challenges: individual drives may not be able to see complete data records, and may not deliver the performance required by high-level data insertion, access, and protection workflows. We discuss those challenges in the talk, share our designs, and report early results.
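One way to see why a drive might not hold complete records is to stripe a byte stream of back-to-back records across drives in fixed-size chunks, as erasure-coded layouts commonly do. The sketch below is an illustration under assumed parameters (24-byte records, 64-byte chunks, three data drives, round-robin placement, parity omitted); the abstract does not describe the actual Campaign or MarFS layout, and the function names are hypothetical.

```python
def stripe(data, num_data_drives, chunk_size):
    """Place fixed-size chunks round-robin on data drives (parity omitted)."""
    drives = [[] for _ in range(num_data_drives)]
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        drives[n % num_data_drives].append(data[offset:offset + chunk_size])
    return drives


def straddling_records(records, chunk_size):
    """Return indices of records whose bytes cross a chunk boundary."""
    crossing = []
    offset = 0
    for i, record in enumerate(records):
        first_chunk = offset // chunk_size
        last_chunk = (offset + len(record) - 1) // chunk_size
        if first_chunk != last_chunk:
            crossing.append(i)
        offset += len(record)
    return crossing


# Eight 24-byte records written back to back, striped in 64-byte chunks.
records = [bytes([i]) * 24 for i in range(8)]
layout = stripe(b"".join(records), num_data_drives=3, chunk_size=64)
split = straddling_records(records, chunk_size=64)
print("records split across chunks:", split)
```

With two or more data drives, consecutive chunks land on different drives, so any record reported here is split between drives and neither drive can evaluate a predicate on it alone. Aligning chunk boundaries to record boundaries is one possible mitigation; whether C2 does so is not stated in the abstract.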