# SQL vs NoSQL
One of the most important design decisions. There's no winner — it depends on your data shape, access patterns, and scale requirements.
| Aspect | SQL | NoSQL |
|---|---|---|
| Schema | Fixed, enforced | Flexible, schema-on-read |
| Scaling | Primarily vertical | Built for horizontal |
| Queries | Complex JOINs, aggregations | Simple lookups, denormalized |
| Consistency | Strong (ACID) | Eventual (tunable) |
| Best For | Financial, relational data | High write, flexible schema |
# Indexing
An index is a sorted data structure that speeds up lookups. Without an index, the database scans every row (O(n)). With a B-Tree index: O(log n).
-- Without index: Full table scan (1M rows → 1M comparisons) SELECT * FROM users WHERE email = 'alice@example.com'; -- With index on email: B-Tree lookup (~20 comparisons) CREATE INDEX idx_users_email ON users(email); -- Composite index for multi-column queries: CREATE INDEX idx_orders_user_date ON orders(user_id, created_at); -- Supports: WHERE user_id = ? AND created_at > ? -- Also supports: WHERE user_id = ? (leftmost prefix) -- Does NOT support: WHERE created_at > ? (skip first column)
# Replication
Replication copies data across multiple servers for read scaling and fault tolerance.
Writes
│
┌──────▼──────┐
│ Primary │──── replication stream
│ (Leader) │
└──────────────┘
│ │
┌────▼─┐ ┌─▼────┐
│Replica│ │Replica│ ← serve READ queries
│ (1) │ │ (2) │
└───────┘ └───────┘- Synchronous — Primary waits for replica ACK. Consistent but slower writes.
- Asynchronous — Primary doesn't wait. Fast writes but potential stale reads (replication lag).
- Semi-synchronous — Wait for at least 1 replica. Balance of speed and safety.
# Normalization vs Denormalization
Normalization eliminates redundancy (3NF, BCNF). Great for writes — update one place. Denormalization adds redundancy for faster reads — trade write complexity for read speed.
Most production systems are partially denormalized: normalized core tables with strategic denormalization for hot read paths (materialized views, computed columns, cache tables).
⚡ Key Takeaways
- Choose SQL for relational data + ACID; NoSQL for flexible schemas + horizontal scale
- Indexes make reads O(log n) but slow down writes — index selectively
- Replication scales reads and provides fault tolerance; beware of replication lag
- Start normalized, denormalize specific hot paths based on measured needs
- Most real systems use polyglot persistence — multiple databases for different purposes