Glean 拾遗
Tag: Database

4 picks

All picks tagged “Database”.

Recent picks

4 picks · chronological

05-29

ClickHouse 10 Best Practices

A ClickHouse solution architect shares 10 field-tested best practices derived from customer engagements, covering schema design, data types, partitioning, skipping indexes, the JSON type, data ingestion, materialized views, system tables, ReplacingMergeTree, and JOIN optimization. Benchmarks on a 150M-row Amazon reviews dataset quantify the impact: a proper ORDER BY key cuts rows scanned by a factor of 347, unnecessary partitioning makes queries 46× slower, correct data types cut storage by 12% and double query speed, skipping indexes reduce scans by 80%, and dictionary lookups run nearly 3× faster than regular JOINs. The article argues that understanding ClickHouse internals yields orders-of-magnitude improvements without hardware changes.

www.infoq.cn · 15 min · Database · Performance

05-29

ClickStack Observability: MCP Server, AI Notebooks, and ClickStack Cloud

At Open House, ClickHouse announced three major observability updates: ClickStack Cloud (serverless, managed, private preview), AI Notebooks (beta), and an open-source ClickStack MCP server. AI Notebooks replace linear chat with persistent, branchable investigation workspaces that expose every query and step. The MCP server gives external agents semantic investigative tools; internal benchmarks show 25% fewer tool calls, a 2.5× consistency improvement, and 20% higher evaluation scores compared with a raw SQL MCP. The server also supports bi-directional orchestration: agents can create dashboards and persist results. The design philosophy is “bring your own agents,” with SQL as an escape hatch when pre-built tools fall short. The post includes setup instructions and a demo. For infrastructure/SRE engineers evaluating ClickHouse-based observability.

clickhouse.com · 15 min · Agents · AI · Database

05-29

From OTel to Rotel: 4x Throughput Increase in PB-Scale Tracing

This article benchmarks OpenTelemetry data planes for writing trace spans to ClickHouse. On the same 8‑core host, Rotel achieves 3.7 million spans/sec (462k spans/core/sec), a >4× improvement over the OTel Collector. Gains come from three optimizations: binary encoding of JSON columns in RowBinary, moving deserialization to a shared thread pool to avoid tokio blocking and glibc allocator lock contention, and enabling fast LZ4 compression. The test also exposes silent data loss in the OTel Collector under backpressure. For engineers scaling large telemetry pipelines.
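The throughput figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only the numbers in this summary (3.7 million spans/sec, 8 cores, a speedup of more than 4×); the variable names are illustrative, and because the quoted figures are rounded, the results are approximate.

```python
# Figures quoted in the summary above (rounded values).
total_spans_per_sec = 3_700_000   # Rotel throughput on the 8-core host
cores = 8
speedup_lower_bound = 4           # Rotel is "more than 4x" the OTel Collector

# Per-core rate: total throughput divided by core count.
per_core = total_spans_per_sec / cores

# Because the speedup exceeds 4x, the Collector's rate on the same host
# must be below this ceiling.
collector_upper_bound = total_spans_per_sec / speedup_lower_bound

print(f"per-core rate: {per_core:,.0f} spans/core/sec")
print(f"implied OTel Collector ceiling: {collector_upper_bound:,.0f} spans/sec")
```

The per-core result lands at roughly 462k spans/core/sec, in line with the quoted value, and the implied OTel Collector rate on the same host is under about 925k spans/sec.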

www.infoq.cn · 18 min · Database · Performance

05-29

Introducing ClickHouse Agent Skills

ClickHouse has released official Agent Skills: an open-source set of 28 prioritized best-practice rules covering schema design, query optimization, and data ingestion, packaged using Anthropic's Agent Skills specification. Users can add them locally with `npx skills add clickhouse/agent-skills`. AI agents (e.g., Claude Code) automatically invoke these rules when appropriate, helping avoid common pitfalls like wrong ORDER BY, non-scalable JOINs, or missing materialized views. The Apache 2.0-licensed repo welcomes community contributions.

clickhouse.com · 3 min · Agents · AI · Database