DevToolBoxKOSTENLOS
Blog

SQL-Abfrageoptimierung: 15 Techniken zur Datenbankbeschleunigung

14 Min. Lesezeitvon DevToolBox

SQL-Abfrageoptimierung: Indizes, EXPLAIN und N+1-Vermeidung

Langsame Abfragen sind einer der häufigsten Performance-Engpässe in Webanwendungen.

B-Tree-Indizes: Die Grundlage

B-Tree ist der Standard-Indextyp und verarbeitet Gleichheits-, Bereichsabfragen und Sortierung effizient.

-- Understanding B-Tree indexes (default in PostgreSQL, MySQL, SQLite)
-- An index is a separate data structure that maps column values → row locations

-- Without index: full table scan (reads every row)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Seq Scan on orders  (cost=0.00..1842.00 rows=18 width=96)

-- Create a B-Tree index on the foreign key
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);

-- With index: index scan (jumps directly to matching rows)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Index Scan using idx_orders_customer_id on orders
-- (cost=0.29..8.55 rows=18 width=96)

-- Composite index: order matters! (customer_id, created_at) covers:
--   WHERE customer_id = ?
--   WHERE customer_id = ? AND created_at > ?
-- But NOT:  WHERE created_at > ?  (left-prefix rule)
CREATE INDEX idx_orders_customer_date
  ON orders(customer_id, created_at DESC);

-- Partial index: only index rows matching a condition
-- Huge win when you frequently query a small subset
CREATE INDEX idx_orders_pending
  ON orders(created_at)
  WHERE status = 'pending';   -- only ~5% of rows get indexed

GIN- und Ausdrucks-Indizes

GIN ist für mehrwertige Daten wie JSONB, Arrays und Volltextsuche optimiert.

-- GIN (Generalized Inverted Index): for JSONB, arrays, full-text search
-- Much faster than B-Tree for containment queries

-- JSONB column with GIN index
CREATE INDEX idx_products_attrs ON products USING GIN (attributes);

-- Containment query: "find products where attributes includes {color: 'red'}"
SELECT * FROM products
WHERE attributes @> '{"color": "red"}';   -- uses GIN index

-- Full-text search with GIN
ALTER TABLE articles ADD COLUMN tsv tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))
  ) STORED;

CREATE INDEX idx_articles_fts ON articles USING GIN (tsv);

SELECT title, ts_rank(tsv, q) AS rank
FROM articles, to_tsquery('english', 'postgres & performance') q
WHERE tsv @@ q
ORDER BY rank DESC
LIMIT 10;

-- Expression index: index on a computed value
CREATE INDEX idx_users_email_lower ON users (lower(email));
SELECT * FROM users WHERE lower(email) = lower('User@Example.com');

EXPLAIN ANALYZE lesen

EXPLAIN ANALYZE führt die Abfrage aus und zeigt echte Timing- und Zeilenzahlen.

-- EXPLAIN ANALYZE: actual execution stats (not just estimates)
-- Always use ANALYZE to see real row counts and timing

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
  c.name,
  COUNT(o.id) AS order_count,
  SUM(o.total) AS revenue
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.created_at >= '2025-01-01'
  AND o.status = 'completed'
GROUP BY c.id, c.name
ORDER BY revenue DESC
LIMIT 20;

-- Reading the output:
-- "Seq Scan"     → no index used, reading all rows   (BAD for large tables)
-- "Index Scan"   → using B-Tree index                (good)
-- "Bitmap Scan"  → multiple ranges, batched          (good for many rows)
-- "Hash Join"    → join via hash table               (good for large datasets)
-- "Nested Loop"  → row-by-row join                   (good for small inner set)
-- "actual time=X..Y"  → X=first row, Y=last row (ms)
-- "rows=N"       → if estimate vs actual differ a lot → stale stats!

-- Fix stale statistics:
ANALYZE orders;                          -- update stats for one table
ANALYZE;                                 -- update all tables
-- Or set autovacuum_analyze_scale_factor = 0.01 in postgresql.conf

N+1-Problem: Erkennung und Behebung

Das N+1-Problem tritt auf, wenn Code eine Abfrage für eine Liste ausführt und dann N zusätzliche Abfragen.

-- N+1 Problem: the silent query killer in ORMs

-- BAD: N+1 in pseudo-ORM code
-- for each user (1 query):
--   fetch user's orders (N queries)
-- Total: 1 + N queries for N users

-- Example: fetching 100 users + their order counts = 101 queries
const users = await db.query('SELECT * FROM users LIMIT 100');
for (const user of users) {
  const orders = await db.query(
    'SELECT COUNT(*) FROM orders WHERE customer_id = $1',
    [user.id]
  );
  user.orderCount = orders[0].count;
}

-- GOOD: Single query with JOIN or subquery
SELECT
  u.id,
  u.name,
  u.email,
  COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.customer_id = u.id
GROUP BY u.id, u.name, u.email
LIMIT 100;

-- GOOD: Use IN clause to batch (avoid when set is large)
-- Fetch users first, then batch-load orders
SELECT * FROM orders
WHERE customer_id = ANY($1::int[]);  -- pass array of user IDs

-- GOOD: Prisma/TypeORM eager loading
const users = await prisma.user.findMany({
  include: { orders: true },    -- generates single LEFT JOIN query
  take: 100,
});

Abfrage-Umschreibe-Muster

Viele langsame Abfragen können durch Umschreiben deutlich beschleunigt werden.

-- Query rewrites that dramatically improve performance

-- 1. Replace correlated subquery with JOIN
-- SLOW: correlated subquery executes for each outer row
SELECT name,
  (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) AS cnt
FROM customers c;

-- FAST: single aggregation pass
SELECT c.name, COUNT(o.id) AS cnt
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

-- 2. Use window functions instead of self-join
-- SLOW: self-join to find each employee's department average
SELECT e.name, e.salary,
  (SELECT AVG(salary) FROM employees e2 WHERE e2.dept = e.dept) AS dept_avg
FROM employees e;

-- FAST: window function scans table once
SELECT name, salary,
  AVG(salary) OVER (PARTITION BY dept) AS dept_avg
FROM employees;

-- 3. Avoid SELECT * in subqueries (forces extra columns)
-- SLOW
SELECT * FROM (SELECT * FROM large_table) sub WHERE id > 1000;

-- FAST: only select needed columns
SELECT id, name FROM large_table WHERE id > 1000;

-- 4. Use EXISTS instead of COUNT for existence checks
-- SLOW: scans all matching rows
SELECT * FROM products p WHERE (SELECT COUNT(*) FROM reviews r WHERE r.product_id = p.id) > 0;

-- FAST: stops at first match
SELECT * FROM products p WHERE EXISTS (SELECT 1 FROM reviews r WHERE r.product_id = p.id);

Connection Pooling und Batch-Operationen

Datenbankverbindungen sind teuer. Pooling, Prepared Statements und Batch-INSERT verbessern den Durchsatz.

-- Connection pooling & query batching best practices

-- PostgreSQL: use PgBouncer or built-in pooling
-- Connection pool settings (nodejs pg pool)
import { Pool } from 'pg';

const pool = new Pool({
  max: 20,                // max connections (default: 10)
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
  // keepAlive improves performance for long-lived processes
  keepAlive: true,
  keepAliveInitialDelayMillis: 10000,
});

// Prepared statements: parse once, execute many times
await pool.query({
  name: 'get-user-orders',
  text: 'SELECT * FROM orders WHERE customer_id = $1 AND status = $2',
  values: [customerId, 'completed'],
});

-- Batch INSERT with single statement (much faster than loop)
INSERT INTO events (user_id, event_type, created_at)
SELECT * FROM UNNEST(
  $1::int[],
  $2::text[],
  $3::timestamptz[]
);
-- Inserts thousands of rows in one round-trip

-- Use COPY for bulk data loading (fastest option)
COPY orders (customer_id, total, status, created_at)
FROM '/tmp/orders.csv'
WITH (FORMAT CSV, HEADER true);

Index-Typen Vergleich

Index TypeBest ForOperatorsWrite OverheadNotes
B-TreeEquality, range, sort=, <, >, BETWEEN, LIKE prefixLowDefault; most queries
GINMulti-valued (JSONB, arrays)@>, <@, @@, ANYHigh writeFull-text, JSONB queries
GiSTGeometric, full-text&&, @>, <@, ~~MediumNearest-neighbor, overlap
BRINSequential data (timestamps)=, <, >Very lowHuge tables, append-only
HashEquality only=LowFaster than B-Tree for =, no range
PartialSubset of rowsAnyLowWHERE clause filter in index def
ExpressionComputed valuesAnyMediumlower(email), date_trunc()

Häufig gestellte Fragen

Wie erkenne ich, welche Abfragen ich zuerst optimieren sollte?

Aktivieren Sie das Slow-Query-Log und verwenden Sie pg_stat_statements.

Warum kann ein Index Abfragen manchmal verlangsamen?

Index-Scans haben Overhead; der Planer bevorzugt manchmal Sequential Scans.

Was ist der Unterschied zwischen EXPLAIN und EXPLAIN ANALYZE?

EXPLAIN zeigt den geschätzten Plan, EXPLAIN ANALYZE führt die Abfrage tatsächlich aus.

Wie behebt man N+1 in einem ORM?

Verwenden Sie Eager Loading (include/relations) oder DataLoader für Batching.

Verwandte Tools

𝕏 Twitterin LinkedIn
War das hilfreich?

Bleiben Sie informiert

Wöchentliche Dev-Tipps und neue Tools.

Kein Spam. Jederzeit abbestellbar.

Verwandte Tools ausprobieren

{ }JSON FormatterCron Expression Parser

Verwandte Artikel

PostgreSQL Performance Tuning: Indizierung, Abfrageoptimierung & Konfiguration

Umfassender Leitfaden zur PostgreSQL-Leistungsoptimierung — Indizierungsstrategien und Konfiguration.

SQL vs NoSQL: Kompletter Guide zur Datenbankwahl

SQL vs NoSQL verstehen: PostgreSQL, MongoDB, Redis Vergleich.

SQL Joins visuell erklaert: INNER, LEFT, RIGHT, FULL, CROSS

SQL Joins mit visuellen Diagrammen und praktischen Beispielen meistern. INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN, CROSS JOIN, SELF JOIN mit Performance-Tipps.