Wednesday, March 5, 2025

DeepSeek Smallpond - A lightweight data processing framework built on DuckDB and 3FS

Features

🚀 High-performance data processing powered by DuckDB

🌍 Scalable to handle PB-scale datasets

🛠️ Easy operations with no long-running services 


Performance

We evaluated smallpond using the GraySort benchmark (script) on a cluster of 50 compute nodes and 25 storage nodes running 3FS. The benchmark sorted 110.5 TiB of data in 30 minutes and 14 seconds, an average throughput of 3.66 TiB/min.
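As a quick sanity check on those figures, the arithmetic can be reproduced in a few lines of Python. This is only a back-of-the-envelope sketch: the per-node number assumes the load is spread evenly across the 50 compute nodes, which the benchmark description does not state.

```python
# Figures quoted in the benchmark description above.
data_tib = 110.5                 # total data sorted, in TiB
elapsed_min = 30 + 14 / 60       # 30 minutes and 14 seconds, in minutes
compute_nodes = 50

# Average throughput for the whole cluster.
throughput_tib_per_min = data_tib / elapsed_min

# Same figure in GiB/s (1 TiB = 1024 GiB, 1 min = 60 s).
throughput_gib_per_s = throughput_tib_per_min * 1024 / 60

# ASSUMPTION: work is split evenly across compute nodes.
per_node_gib_per_s = throughput_gib_per_s / compute_nodes

print(f"cluster: {throughput_tib_per_min:.2f} TiB/min "
      f"({throughput_gib_per_s:.1f} GiB/s)")
print(f"per compute node (even split assumed): {per_node_gib_per_s:.2f} GiB/s")
```

With the rounded inputs above, the result comes out at roughly 3.65 TiB/min, which agrees with the quoted 3.66 TiB/min to within the rounding of the published numbers.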


(DuckDB is an open-source, column-oriented relational database management system (RDBMS). It is designed to deliver high performance on complex queries against large databases in an embedded configuration, such as joining tables with hundreds of columns and billions of rows. Unlike other embedded databases such as SQLite, DuckDB does not target transactional (OLTP) applications; it specializes in online analytical processing (OLAP) workloads.)


Smallpond provides both high-level and low-level APIs

Currently, smallpond provides two different APIs, which support dynamic and static construction of data flow graphs respectively. For historical reasons, the two APIs use different scheduler backends and support different configuration options.

1. The High-level API currently uses Ray as the backend, supporting dynamic construction and execution of data flow graphs.

2. The Low-level API uses a built-in scheduler and only supports one-time execution of static data flow graphs. However, it offers more performance optimizations and richer configuration options.

We are working to merge them so that in the future you can use a unified high-level API and freely choose between Ray and the built-in scheduler.
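To make the difference between static and dynamic graph construction more concrete, here is a toy sketch in plain Python. It does not use smallpond, Ray, or DuckDB, and it says nothing about how smallpond is implemented. `Node` and `execute` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One step in a toy data flow graph (hypothetical, not a smallpond class)."""
    name: str
    fn: Callable[..., list]
    inputs: List["Node"] = field(default_factory=list)


def execute(node: Node, cache: Optional[Dict[str, list]] = None) -> list:
    """Evaluate a node after its inputs, reusing results cached by node name."""
    cache = {} if cache is None else cache
    if node.name not in cache:
        args = [execute(dep, cache) for dep in node.inputs]
        cache[node.name] = node.fn(*args)
    return cache[node.name]


# Static style: the whole graph is declared up front, then run exactly once.
source = Node("source", lambda: [5, 3, 8, 1])
sort_step = Node("sorted", lambda rows: sorted(rows), [source])
top2 = Node("top2", lambda rows: rows[-2:], [sort_step])
static_result = execute(top2)

# Dynamic style: run part of the graph, look at the result,
# and only then decide which step to add next.
cache: Dict[str, list] = {}
rows_so_far = execute(sort_step, cache)
if len(rows_so_far) > 3:
    next_step = Node("smallest2", lambda rows: rows[:2], [sort_step])
else:
    next_step = Node("everything", lambda rows: rows, [sort_step])
dynamic_result = execute(next_step, cache)  # reuses the cached sorted rows

print(static_result, dynamic_result)
```

In the static case the scheduler sees the complete graph before anything runs, which is what makes whole-graph optimization possible. In the dynamic case later steps can depend on earlier results, at the cost of the scheduler never seeing the full plan in advance.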


https://github.com/deepseek-ai/smallpond

