Technology

SQL Indexes: The Practical Guide That Will Save Your Next Deployment

B

Boundev Team

Mar 25, 2026
11 min read
SQL Indexes: The Practical Guide That Will Save Your Next Deployment

Most slow queries are not caused by bad code. They are caused by missing or misused indexes. This guide covers B-Tree internals, composite index design, write costs, and the decision framework that separates production-grade indexing from guesswork.

Key Takeaways

A table with zero indexes can be 100x faster to insert than a table with a single index — the write cost of indexes is real and often underestimated
B-Tree indexes handle 90% of use cases — they support equality checks, range queries, sorting, and prefix searches; Hash indexes are faster only for exact-match lookups on high-cardinality columns
Composite indexes follow the leftmost prefix rule — the column order determines which queries can use the index, and getting it wrong wastes the entire structure
Over-indexing is a production problem — unused indexes consume storage, slow down writes, inflate WAL data, and evict hot pages from your buffer pool
Every index must justify its existence against the write cost it imposes — "we might need it someday" is not a justification

The Query That Shut Down Your API

It is 2 AM. Your pagerduty fires. The API is returning 500 errors. The database is at 100% CPU. Your users are timing out. You dig into the slow query log and find it: a single SELECT on the orders table that is taking 18 seconds. The query looks normal. It uses the right columns. It has a WHERE clause. But it is doing a full table scan on 47 million rows. You add an index. Thirty seconds later, the query runs in 12 milliseconds. The CPU drops to 8%. The incident is closed.

This is the story of most database performance incidents — and it is almost never about bad code. It is about indexes. Either they are missing, they are wrong, or nobody understood the query plan well enough to design them correctly. The teams that have the fewest database incidents are not the ones with the most expensive servers. They are the ones that understand how indexes work at the structural level — not just how to create them, but how the database actually uses them, and what they cost.

This guide is not a reference manual. It is a production engineering guide. We cover what actually matters when you are designing indexes for a system that handles real traffic, real writes, and real consequences when things go wrong.

What an Index Actually Is — and Why It Is Not Free

Most developers learn about indexes through the lens of speed: create an index, queries get faster. This is true, but it is dangerously incomplete. An index is a separate data structure, maintained by the database engine, that stores sorted references to your table data. When you create an index on a column, you are not just making lookups faster — you are creating a new structure that the database must maintain on every single write to that table. That maintenance has a cost, and that cost compounds at scale.

The canonical benchmark that every database engineer should know: a table with zero indexes can be 100x faster to insert than a table with a single index. This is not an exaggeration. The actual data write is trivial. The index maintenance is the expensive part. Every INSERT must update every index on that table. Every UPDATE must update every index on every column that changed. Every DELETE must remove the row from every index. This cost is paid in CPU, I/O, and lock contention — and it is paid on every write, not just once at creation time.

Storage overhead is the second hidden cost. Indexes typically consume 10-30% of the table size per index. A table with 5 indexes on a 100 GB dataset is carrying an additional 50-90 GB of index data that must be cached, maintained, and backed up. For small tables, this is noise. For large tables with billions of rows, this is infrastructure money.

Need a database engineer who understands indexing at the production level?

Boundev places pre-vetted backend engineers with deep database optimization experience — PostgreSQL, MySQL, query planning, and index design — in under 72 hours.

See How We Do It

B-Tree: The Index Type That Handles 90% of Your Use Cases

When you create an index without specifying a type, PostgreSQL, MySQL, SQL Server, and Oracle all default to B-Tree — and for good reason. B-Tree, short for Balanced Tree, is the most versatile index type in any relational database. It handles equality checks, range queries, sorting, and prefix searches. It is the default for almost every query pattern you will encounter in a business application.

The "balanced" in B-Tree is the key property. The tree structure stays balanced regardless of how much data you insert — meaning every lookup takes the same number of steps whether you are searching for the first row or the last row in a billion-row table. A B-Tree with 1 billion rows navigates to any leaf node in roughly 4 levels. That is 4 disk page reads to find a specific row, versus potentially 1 billion reads in a full table scan.

Each level of the tree narrows the search space. The root and intermediate nodes store sorted key ranges and child page pointers. You traverse from the root, comparing your search key against the ranges at each level, and you follow the correct child pointer until you reach the leaf nodes. The leaf nodes contain the actual data pointers — the row locations — in sorted order by the indexed column. That sorted leaf layer is what enables efficient range scans: once you find the start of your range, you simply walk the linked leaf nodes sequentially.

B-Tree indexes excel at: equality searches (WHERE email = 'user@example.com'), range queries (WHERE created_at BETWEEN '2025-01-01' AND '2025-01-31'), prefix searches (WHERE name LIKE 'John%'), sorting (ORDER BY created_at DESC), and composite queries with multiple filter conditions. If your queries use operators like =, <, >, <=, >=, BETWEEN, or IN, B-Tree is almost certainly your right choice.

Composite Indexes: The Leftmost Prefix Rule That Makes or Breaks Performance

A composite index spans multiple columns. The syntax CREATE INDEX idx_user_status ON users(status, created_at) creates an index that stores concatenated values — the status and created_at values together — sorted first by status, then by created_at within each status value. The critical rule that trips up most engineers is the leftmost prefix principle: this index can only be used by queries that filter by the first column, or by the first column plus any subsequent columns in order. A query filtering by status alone uses the index. A query filtering by status and created_at uses the index. A query filtering by created_at alone cannot use the index at all.

Think of it like a phone book sorted by last name, then first name. It is easy to find everyone named Smith. It is easy to find all people named Smith who live in Chicago. But if you want to find everyone who lives in Chicago — regardless of name — the phone book is useless. The index is sorted by the wrong dimension for that query.

Composite Index Column Ordering Framework

Follow this hierarchy when designing composite indexes to maximize their utility.

1. Equality columns first — Columns that use = in your WHERE clause filter most aggressively and should be the leading columns.
2. Range or sort columns last — Columns with >, <, BETWEEN, or ORDER BY should come after equality columns, because ranges terminate the index scan.
3. High-selectivity columns first — The column that filters out the most rows should lead, so the database discards rows as early as possible.
4. Drop redundant single-column indexes — If you have both a single-column index on (A) and a composite index on (A, B), the composite covers the single-column use case and the standalone index is redundant.

Covering indexes (also called index-only scans) are a powerful extension of composite indexes. A covering index contains all the columns needed by a query, so the database can satisfy the entire request from the index without ever touching the table. If you frequently run SELECT user_id, email, status FROM users WHERE user_id = $1, you can create an index that includes those three columns. The database reads the index alone, never touches the table heap, and delivers the result faster.

Hash Indexes: When Fast Is Not Fast Enough

Hash indexes use a hash function to transform keys into fixed-length hash codes, which are stored in a hash table with buckets. The lookup is constant time — O(1) — meaning the lookup speed does not depend on how many rows are in the table. For exact-match lookups on high-cardinality columns, this sounds ideal. In practice, hash indexes see limited use because they have a critical limitation: they do not support range queries, sorting, or prefix searches. You cannot do WHERE api_token > 'abc', because the hash function destroys the ordering relationship between values.

Most databases do not default to hash indexes for good reason. B-Tree indexes perform comparably to hash indexes for equality lookups while offering significantly more functionality. Hash indexes make sense in very specific scenarios: in-memory tables where you need only exact-match access, caching layers with high-volume equality lookups, or unique constraint enforcement where the database uses a hash structure internally. For your typical application queries — filtering by date ranges, sorting by created_at, searching by name prefix — B-Tree is the answer.

Need Database Engineers Who Understand Indexing?

Boundev places pre-vetted engineers with production database experience — from index design to query optimization — in under 72 hours.

Talk to Our Team

The Write Cost Nobody Tells You About

Here is the part of the indexing conversation that most tutorials skip: every index you add is a standing instruction to your database, executed 24 hours a day, 365 days a year. Every time any row in the indexed table changes, the database must update every index that includes the changed columns. This is not theoretical. It has measurable, often surprising consequences in production.

The write amplification problem is the most significant. A table with zero indexes can be 100x faster to insert than a table with a single index. Add eight indexes — not unusual for a well-intentioned team that added an index for every reported slow query — and your insert throughput can degrade by orders of magnitude. This is the tax you pay for every read optimization: a compounding write cost that grows with the number of indexes.

But the write cost is only the beginning. In PostgreSQL, every change to an index page generates Write-Ahead Log entries. More indexes means more WAL data that is shipped to replicas, archived for point-in-time recovery, and processed by backup systems. Unnecessary indexes do not just slow your primary — they increase replication lag, inflate backup storage, and extend recovery time objectives.

Unused indexes create a secondary problem that borders on ironic: they evict hot data from your buffer pool. When your buffer pool is full of cold, unused index pages, the hot pages that serve your real queries get pushed out. An index that nobody uses can make your other indexes slower by stealing the memory that the useful indexes need. In one documented PostgreSQL analysis, 17% of index storage was pure waste — indexes that were never scanned but were actively costing the database performance on every write.

The Decision Framework: When to Add an Index and When Not To

The question is never "will this index make my reads faster?" The answer is almost always yes. The real question is "is the read improvement worth the write cost across every insert, update, and delete this table will process for the lifetime of this index?" If you cannot answer that question with data, you should not create the index yet.

Start by indexing your foreign key columns. This is database optimization 101, but it is surprising how often it is overlooked. Every foreign key should have an index because you will inevitably join on those columns or filter by them. This is the baseline that most tables need.

Then index the columns that appear most frequently in your WHERE clauses and ORDER BY clauses — but only after you have confirmed that the query is actually slow and that the database is not already using an existing index to satisfy it. The EXPLAIN ANALYZE command is your most important tool here. It shows you exactly how the database executes your query, whether it is using an index, and what the actual cost is.

Scenario Index Type Example
Range queries, sorting, general use B-Tree WHERE created_at > '2025-01-01'
Exact match only, high cardinality Hash WHERE api_token = $1
Multiple columns, combined WHERE clause Composite (B-Tree) WHERE status = 'active' AND created_at > $1
Large time-series tables, sequential data BRIN WHERE created_at BETWEEN $1 AND $2 (time-series logs)
Text search, word matching Full-text / GIN WHERE body LIKE '%keyword%'

For write-heavy tables — logging tables, event streams, time-series data — aim for the absolute minimum: a primary key and perhaps one or two carefully chosen indexes. For read-heavy tables with infrequent writes — reference data, configuration tables, product catalogs — you can afford more indexes. The question is never "can this index help?" It is "is this specific read improvement worth this specific write cost, sustained indefinitely?"

How Boundev Solves This for You

Everything in this blog — the B-Tree mechanics, the leftmost prefix rule, the write amplification problem, the decision framework — represents knowledge that most engineering teams learn through expensive production incidents. The teams that avoid those incidents are the ones that have database engineers who think about indexing as an architectural decision, not a reactive fix. At Boundev, we have embedded database engineering expertise into production systems serving millions of rows and complex multi-tenant workloads.

Our dedicated engineering teams include database engineers who design index strategies as part of the architecture — not after a slow query appears in production.

● Proactive index design from day one
● Query plan analysis and optimization

Need a PostgreSQL or MySQL specialist fast? We place pre-vetted backend engineers with production database optimization experience in under 72 hours.

● Database performance specialists
● Immediate contribution to index audits

Outsource the database engineering for your platform — from schema design and index architecture to query optimization and performance monitoring.

● End-to-end database architecture
● Production monitoring and index audits

The Bottom Line

100x
Potential query speedup with proper indexing
4
B-Tree depth for 1 billion rows
100x
Insert slowdown with zero vs. one index
72hrs
Boundev engineer deployment

Need a database engineer to audit your indexes and fix slow queries?

Boundev's engineering teams have optimized database performance for applications serving millions of users — from index design and query planning to production monitoring and incident response.

See How We Do It

Frequently Asked Questions

What is the difference between B-Tree and Hash indexes?

B-Tree indexes maintain sorted order and support range queries, inequality comparisons, and sorting operations. Hash indexes use a hash function for exact-match lookups only, offering O(1) lookup speed for equality predicates but no support for range queries or ordered operations. B-Tree is the default in almost every database for good reason: it handles 90% of use cases. Hash indexes make sense only for specific scenarios: in-memory tables with only exact-match access, high-cardinality columns where constant-time lookup matters more than range support. For your typical application queries, B-Tree is the answer.

How do composite indexes work and why does column order matter?

A composite index stores concatenated values from multiple columns in sorted order. The critical rule is the leftmost prefix principle: the index supports queries that filter by the first column, or by the first column plus any subsequent columns in order. For an index on (status, created_at), queries filtering by status alone or status plus created_at use the index. A query filtering by created_at alone cannot use it. Column order matters because the index is sorted by the first column first, then the second within each first-column value. Equality columns should lead; range or sort columns should come last. Getting this wrong wastes the entire index structure.

How many indexes should a table have?

There is no universal number. The right count depends on your workload balance between reads and writes. Every index improves read queries but slows down INSERT, UPDATE, and DELETE operations proportionally. For read-heavy OLTP workloads, you might have 5-10 indexes on frequently queried columns. For write-heavy tables (logging, events, time-series), aim for the absolute minimum — a primary key and one or two carefully chosen indexes. Every index must justify its existence against the write cost it imposes. If you cannot point to a specific, frequently-executed query that uses an index, that index should not exist. Run regular audits: PostgreSQL's pg_stat_user_indexes shows which indexes have never been scanned.

How do I know if my indexes are actually being used?

Use EXPLAIN ANALYZE before and after adding any index. In PostgreSQL, pg_stat_user_indexes shows idx_scan — the number of times an index has been used. An index with zero scans is not helping your reads; it is only costing your writes. In PostgreSQL: SELECT * FROM pg_stat_user_indexes WHERE idx_scan = 0 AND schemaname = 'public'; will show you every unused index. Run this query quarterly. Unused indexes are not harmless — they consume storage, slow down writes, inflate WAL data, and evict hot pages from your buffer pool. If an index has been unused for months, drop it.

What are BRIN indexes and when should I use them?

BRIN (Block Range Index) indexes are designed for naturally ordered data in very large tables — time-series data, logs, event streams — where the physical order of rows on disk matches the logical order by the indexed column. A BRIN index is dramatically smaller than a B-Tree index because it stores only the min and max values for each block range rather than every individual key. For a table with billions of rows sorted by created_at, a BRIN index might be 1000x smaller than the equivalent B-Tree index. The trade-off is selectivity: BRIN indexes work well when the indexed column's values have strong correlation with physical row order. They do not help for high-cardinality columns with no natural ordering relationship to physical storage.

Free Consultation

Let's Build This Together

You now know what separates production-grade indexing from guesswork. The next step is getting the engineering capacity to implement it.

200+ companies have trusted us to build their engineering teams. Tell us what you need — we will respond within 24 hours.

200+
Companies Served
72hrs
Avg. Team Deployment
98%
Client Satisfaction

Tags

#SQL Indexes#Database Optimization#PostgreSQL#MySQL#Performance Tuning#Backend Development#SQL Performance
B

Boundev Team

At Boundev, we're passionate about technology and innovation. Our team of experts shares insights on the latest trends in AI, software development, and digital transformation.

Ready to Transform Your Business?

Let Boundev help you leverage cutting-edge technology to drive growth and innovation.

Get in Touch

Start Your Journey Today

Share your requirements and we'll connect you with the perfect developer within 48 hours.

Get in Touch