DuckDB news, how-tos and comparisons

DuckDB vs ClickHouse performance comparison for structured data serialization and in-memory TPC-DS queries execution
2025-02-11
This blog post discusses the use of Azure Functions and related tools for data processing tasks. The author shares his experience with using Azure Functions over runbooks and Pandas for CSV to Parquet transformations.
Announcing DuckDB 1.2.0
2025-02-05
The DuckDB 1.2.0 release includes a CLI safe mode, friendly SQL features such as prefix aliases and a RENAME clause in SELECT, optimizer improvements, a new C API for extensions, support for musl libc, and many other enhancements. These updates improve DuckDB's functionality and usability while maintaining compatibility across platforms.
Running SQL on files with the esProc is very convenient, on par with duckDB
2025-01-27
Judy discusses the convenience and power of esProc for handling complex data processing tasks. She highlights how SPL (Structured Process Language) can simplify queries on various file types such as CSV, JSON, Excel, and more. The discussion includes examples of step-by-step SQL-like commands and the use of SPL's unique syntax. Judy also mentions that while esProc's SQL is limited to a subset of SQL92, it excels in order-related calculations. She notes that esProc can be integrated into applications as an embedded database using its JDBC driver.
A Duck Takes Flight
2025-01-25
This article discusses how Mike Ritchie at Definite implemented a solution using DuckDB and Arrow Flight for streaming data in near real-time analytics. The solution addresses the concurrency limitations of DuckDB by leveraging Arrow Flight to allow multiple writers and readers simultaneously.
Access Databricks UnityCatalog from duckdb
2025-01-20
This article shows how to evaluate machine-learning models quickly and efficiently. It explains the use of quality gates for automating the evaluation and offers practical examples.
Using DuckDB With Apache Superset, Bonus Spatial Data
2025-01-15
Integrating DuckDB with Apache Superset for Spatial Data Analysis
Blending DuckDB and Iceberg for Optimal Cloud OLAP
2025-01-13
This post explains how to combine DuckDB and data catalogs like Apache Iceberg for fast and flexible data processing. It describes the cache mechanism used to save intermediate results, improve performance, and avoid redundant computations.
SEPARATING STORAGE AND COMPUTE IN DUCKDB
2024-12-17
MotherDuck uses DuckDB to separate storage and compute for handling large datasets and enabling new data architectures. This allows them to decouple workloads from physical machines.
Querying Snowflake Managed Iceberg Tables with DuckDB
2024-12-12
This guide walks through querying Snowflake-managed Iceberg tables using DuckDB. Key steps include setting up AWS credentials, configuring the AWS CLI, and installing necessary extensions in DuckDB to support HTTP, S3, and Iceberg queries.
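The DuckDB side of those steps might look roughly like the sketch below. It only builds the SQL statements as strings, so it runs without DuckDB installed. The helper name `iceberg_setup_statements` and the table location are hypothetical placeholders, not taken from the guide, and AWS credential setup is assumed to be handled separately.

```python
# Sketch: the DuckDB statements implied by the steps above, built as strings.
# The table location passed in is a hypothetical placeholder.

def iceberg_setup_statements(table_location: str) -> list[str]:
    """Return SQL that loads the needed extensions and counts rows in the table."""
    location = table_location.replace("'", "''")  # escape single quotes for SQL
    return [
        "INSTALL httpfs;",   # HTTP and S3 access
        "LOAD httpfs;",
        "INSTALL iceberg;",  # Iceberg table support
        "LOAD iceberg;",
        f"SELECT count(*) FROM iceberg_scan('{location}');",
    ]


if __name__ == "__main__":
    for statement in iceberg_setup_statements("s3://my-bucket/path/to/iceberg_table"):
        print(statement)
```

The statements can be pasted into the DuckDB CLI or executed one by one from a client library once credentials are configured.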
Should You Ditch Spark for DuckDb or Polars?
2024-12-12
This benchmark compares the performance and use cases of Apache Spark with DuckDB and Polars for small data workloads in a Lakehouse architecture. While DuckDB and Polars can marginally outperform Spark at certain scales, the real value lies in strategically mixing and matching engines to leverage their unique strengths.
DuckDB: Running TPC-H SF100 on Mobile Phones
2024-12-06
DuckDB runs the TPC-H benchmark on two mobile phones (an iPhone 16 Pro and a Samsung Galaxy S24 Ultra) with significantly better performance than expected. The results are compared to a research prototype from 2004, which DuckDB outperforms with a runtime of less than 400 seconds on a smartphone.
Turning Your Root URL Into a DuckDB Remote Database
2024-12-01
A technique for hosting a lightweight DuckDB database at a root URL, allowing easy access through ATTACH statements in SQL queries.
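On the client side, attaching such a database comes down to one ATTACH statement. The sketch below composes that statement as a string. The helper name `attach_statement`, the URL, and the alias are illustrative assumptions, not details from the article.

```python
import re


def attach_statement(url: str, alias: str) -> str:
    """Build a DuckDB ATTACH statement for a database file served at `url`."""
    # Keep the alias to a plain SQL identifier so it needs no quoting.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", alias):
        raise ValueError(f"alias is not a simple identifier: {alias!r}")
    quoted_url = url.replace("'", "''")  # escape single quotes for SQL
    return f"ATTACH '{quoted_url}' AS {alias};"


if __name__ == "__main__":
    # Hypothetical root URL that serves a DuckDB database file.
    print(attach_statement("https://example.com/", "remote"))
```

After attaching, tables in the remote database would be addressed through the alias, for example `remote.some_table`, where the table name depends on what the hosted file contains.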
15+ COMPANIES USING DUCKDB IN PRODUCTION: A COMPREHENSIVE GUIDE
2024-11-12
DuckDB is an in-memory, zero-copy database system that has gained popularity across various industries and applications. This article delves into how DuckDB can be implemented in your enterprise environment to address common challenges such as high-performance data processing, lightweight compute scenarios, pipeline optimization, interactive data apps, and secure handling of large datasets.
DuckDB Is Not a Data Warehouse
2024-11-04
DuckDB, a tool focused on columnar data and fast querying capabilities, has gained popularity among analytics engineers. However, the author argues that DuckDB is not suitable as an enterprise data warehouse due to its deployment model and scalability limitations. Instead, he suggests that smaller companies can adopt DuckDB or related extensions for cost-effective solutions.
DuckDB over Pandas/Polars
2024-11-01
Paul Gross compares DuckDB to Polars and Pandas for data analysis, highlighting the SQL-like syntax of DuckDB as easier to use and more familiar to him. He also shares updates on improved code practices with Polars.