Picture this: your application runs smoothly in development. Queries return instantly, the UI feels responsive, and everything clicks into place. Then you launch to production. The database has grown to millions of rows. That innocent-looking query fetching user orders now takes 15 seconds. API timeouts start piling up. Users are complaining. Your manager is asking uncomfortable questions at the standup.
You have been there, or you will be. And more often than not, the culprit is hiding in plain sight: missing or poorly designed database indexes. This is not a theoretical problem. Stack Overflow serves tens of millions of monthly users with a database layer that handles millions of queries per day. Their secret? Meticulous index design combined with continuous query optimization.
What Is a Database Index, Really?
Most developers know that indexes speed up queries. But understanding the why behind indexes transforms how you design them. Without an index, when you run a SELECT query, the database performs a full table scan. It reads every single row, checking each one against your WHERE conditions. Imagine searching a phone book by reading every single page instead of using the alphabetical index at the back. That is what a table scan feels like to your database.
An index is a separate data structure, maintained by the database engine, that stores sorted references to table data. When you create an index on a column, the database builds a tree structure where each leaf points to the actual data row. The database engine can traverse this tree in logarithmic time instead of scanning every row linearly. For a table with 10 million rows, a binary search would need about 23 comparisons (log base 2 of 10 million) instead of reading all 10 million rows, and a wide B-Tree needs far fewer levels still.
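To make the scan-versus-traversal difference concrete, here is a minimal sketch using SQLite as a stand-in for any B-Tree-backed engine (the table, column, and index names are invented for illustration; PostgreSQL's EXPLAIN output looks different, but the plan change is the same idea):

```python
import sqlite3

# SQLite in memory as a stand-in for a B-Tree-backed database engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"name{i}") for i in range(10_000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN reports whether the engine scans or uses an index
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM users WHERE email = 'user9000@example.com'"
before = plan(query)   # no index yet: every row is checked

conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = plan(query)    # tree traversal straight to the matching leaf

print(before)  # a full table scan
print(after)   # a search using idx_users_email
```

Running this shows the plan flip from a scan of the whole table to a search through the index, without changing the query at all.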
The Index Overhead Trade-off
Every silver lining has a cloud, and indexes are no exception: each index consumes disk space and must be updated on every INSERT, UPDATE, and DELETE. We will return to this cost in the Index Maintenance section below.
B-Tree Indexes: The Workhorse of Database Performance
When you create an index without specifying a type, PostgreSQL, MySQL, and most other databases use B-Tree by default. B-Tree stands for Balanced Tree, and that balance is what makes it so efficient. No matter where a value lives in the tree, the database traverses the same number of levels to reach it. This predictability is crucial for consistent query performance under load.
The structure works like an organized filing cabinet. Each level of the tree narrows down the search space. At the root, you have pointers to ranges of values. Those point to intermediate nodes, which further subdivide the ranges, until you reach the leaf nodes that contain the actual data pointers. A B-Tree index with 1 billion rows might only be 4 levels deep. That means the database performs 4 operations instead of 1 billion.
1. Equality searches: WHERE email = 'user@example.com'. B-Tree excels here, traversing directly to the matching leaf.
2. Range queries: WHERE created_at BETWEEN '2025-01-01' AND '2025-12-31'. B-Tree maintains sorted order for efficient range traversal.
3. ORDER BY and GROUP BY: B-Tree stores data in sorted order, eliminating the need for filesort operations.
When B-Tree Shines
B-Tree indexes are the most versatile index type. They handle equality checks, range queries, prefix searches (LIKE 'abc%'), and sorting operations. If your queries use operators like =, <, >, <=, >=, BETWEEN, or IN with a range, B-Tree is likely your best choice. Most OLTP workloads benefit primarily from B-Tree indexes because business queries typically involve ranges, sorting, and filtering.
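The versatility is easy to demonstrate: one B-Tree index can serve an equality lookup, a range scan, and an ordered read. A small SQLite sketch (table and index names are hypothetical; the plan strings are SQLite's own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (created_at, total) VALUES (?, ?)",
    [(f"2025-{m:02d}-{d:02d}", m * d) for m in range(1, 13) for d in range(1, 29)],
)
conn.execute("CREATE INDEX idx_orders_created ON orders(created_at)")

def plan(sql):
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# The same B-Tree index serves all three access patterns:
equality = plan("SELECT * FROM orders WHERE created_at = '2025-06-15'")
range_q  = plan("SELECT * FROM orders WHERE created_at "
                "BETWEEN '2025-01-01' AND '2025-03-31'")
# Reading the index in sorted order replaces an explicit sort step.
ordered  = plan("SELECT created_at FROM orders ORDER BY created_at")

print(equality)
print(range_q)
print(ordered)
```

All three plans reference the same index: equality and range queries become searches, and the ORDER BY is satisfied by walking the index in order instead of sorting.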
Hash Indexes: Speed for Exact Matches
Hash indexes take a different approach. Instead of maintaining a sorted tree structure, they use a hash function to compute the exact location of each value. When you search for a specific value, the database hashes it, goes directly to that location, and retrieves the data pointer. For exact match queries, this is incredibly fast, often outperforming B-Tree.
However, hash indexes have critical limitations. They only support equality comparisons. You cannot use a hash index to find rows WHERE age > 25, or to ORDER BY results. The hash function destroys any ordering relationship between values: a hash of 100 might be stored next to a hash of 2, with no predictable pattern. This is one reason hash indexes are rarely the default choice for application queries; in PostgreSQL they were not even crash-safe (WAL-logged) until version 10.
Use Case: Hash indexes excel in high-throughput scenarios like session caching or temporary lookup tables where you always query by exact key. If you need range queries, sorting, or prefix matching, stick with B-Tree.
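The trade-off can be sketched in plain Python, with a dict standing in for a hash index and a sorted list standing in for a B-Tree (the session keys are invented for illustration):

```python
import bisect

# A hash index is conceptually a dict: one hop to the exact key.
hash_index = {f"session-{i:04d}": f"row-{i}" for i in range(1_000)}
assert hash_index["session-0042"] == "row-42"   # O(1) equality lookup

# Hashing destroys ordering, so a range query must examine every key.
in_range_scan = sorted(
    k for k in hash_index if "session-0100" <= k <= "session-0199"
)

# A B-Tree is conceptually a sorted structure: a range is a single slice.
sorted_keys = sorted(hash_index)
lo = bisect.bisect_left(sorted_keys, "session-0100")
hi = bisect.bisect_right(sorted_keys, "session-0199")
range_via_tree = sorted_keys[lo:hi]
```

Both approaches return the same 100 keys, but the hash side had to touch all 1,000 entries while the sorted side jumped straight to the slice, which is exactly why range queries need B-Tree.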
Composite Indexes: The Art of Column Ordering
A composite index spans multiple columns. The syntax CREATE INDEX idx_user_status ON users(status, created_at) creates an index on both status and created_at together. The database stores values concatenated in a sorted order: first by status, then by created_at within each status. This sounds simple, but the implications are profound.
The critical rule is the leftmost prefix principle. Your composite index supports queries that use the first column, or the first column plus any subsequent columns in order. The index on (status, created_at) can satisfy queries filtering by status alone, or by status plus created_at. But it cannot accelerate a query that filters by created_at alone. The database cannot jump to a specific created_at within an unordered status range.
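The leftmost prefix rule shows up directly in query plans. A minimal SQLite sketch (column and index names are hypothetical; note that no ANALYZE is run, so the planner will not attempt a skip-scan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT,"
    " created_at TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO users (status, created_at, name) VALUES (?, ?, ?)",
    [("active" if i % 2 else "inactive", f"2025-01-{i % 28 + 1:02d}", f"u{i}")
     for i in range(1_000)],
)
conn.execute("CREATE INDEX idx_status_created ON users(status, created_at)")

def plan(sql):
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Leftmost prefix satisfied: status alone, or status plus created_at.
p1 = plan("SELECT * FROM users WHERE status = 'active'")
p2 = plan("SELECT * FROM users WHERE status = 'active'"
          " AND created_at > '2025-01-15'")
# Leftmost prefix violated: created_at alone cannot use the index.
p3 = plan("SELECT * FROM users WHERE created_at > '2025-01-15'")

print(p1)  # uses the composite index
print(p2)  # uses both index columns
print(p3)  # falls back to a table scan
```

The first two queries search the composite index; the third, filtering only on the second column, degrades to a full scan.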
Designing Composite Indexes: A Practical Example
Consider an e-commerce database with an orders table. Common query patterns might include finding orders by customer and date, or filtering by status within a date range. You might create a composite index on (customer_id, order_date, status). This supports queries fetching all orders for a customer, or all orders for a customer within a date range, or all orders for a customer on a specific date with a specific status.
Column ordering should follow your predicates. Put columns used in equality predicates first, with range-filtered columns after them; among equality columns, placing higher-cardinality columns (more distinct values) earlier narrows the search fastest. If you frequently query by status alone, consider whether splitting into two indexes or reversing the order makes sense. There is no universal rule; the right design depends on your specific query patterns. Visit our guide on API design best practices for more context on designing data access patterns.
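Cardinality is cheap to measure before you commit to an ordering. A small sketch (the orders table and its value distribution are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY,"
             " customer_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 500, ("new", "paid", "shipped")[i % 3]) for i in range(3_000)],
)

def cardinality(column):
    # Distinct-value count: a quick proxy for how selective a column is.
    # (String interpolation is fine here because the column name is our own.)
    return conn.execute(
        f"SELECT COUNT(DISTINCT {column}) FROM orders"
    ).fetchone()[0]

print(cardinality("customer_id"))  # 500 distinct values: strong leading column
print(cardinality("status"))       # 3 distinct values: weak on its own
```

A customer_id predicate eliminates 499 out of every 500 rows, while status alone only cuts the table to a third, which is why customer_id earns the leading position in the composite index.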
Partial and Covering Indexes: Precision Optimization
Standard indexes include all rows in a table. But what if you frequently query only a subset? A partial index includes only rows matching a WHERE condition. If 90% of your queries target active users, create a partial index: CREATE INDEX idx_active_users_email ON users(email) WHERE status = 'active'. This index is smaller, faster to maintain, and fits entirely in memory more easily.
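SQLite also supports partial indexes, so the idea can be sketched end to end (names are hypothetical; the key behavior is that the query's WHERE clause must imply the index predicate for the index to be eligible):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY,"
             " email TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO users (email, status) VALUES (?, ?)",
    [(f"u{i}@example.com", "active" if i % 10 else "disabled")
     for i in range(1_000)],
)
# Partial index: only 'active' rows are indexed, keeping it small.
conn.execute(
    "CREATE INDEX idx_active_email ON users(email) WHERE status = 'active'"
)

def plan(sql):
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# The query restates the predicate, so the partial index applies.
p_active = plan("SELECT * FROM users WHERE email = 'u7@example.com'"
                " AND status = 'active'")
# Without the predicate the index cannot be used: it lacks disabled rows.
p_all = plan("SELECT * FROM users WHERE email = 'u7@example.com'")

print(p_active)
print(p_all)
```

This is the practical gotcha with partial indexes: the query must repeat (or logically imply) the index's WHERE condition, or the planner has to ignore the index.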
A covering index takes this further by including additional columns needed to satisfy a query without accessing the table itself. When an index contains all columns referenced in a query (in the WHERE, JOIN, ORDER BY, and SELECT clauses), PostgreSQL can answer the query entirely from the index. This is called an index-only scan. You create a covering index by including extra columns after the indexed columns: CREATE INDEX idx_orders_cover ON orders(customer_id, order_date) INCLUDE (total_amount, status). The INCLUDE clause adds columns without making them part of the index key, avoiding bloat while enabling index-only scans.
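An index-only scan is observable in miniature with SQLite, which lacks an INCLUDE clause, so the extra columns simply go into the index key (a sketch under that assumption; the PostgreSQL INCLUDE form in the text above avoids bloating the key):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,"
    " order_date TEXT, total_amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, total_amount, status)"
    " VALUES (?, ?, ?, ?)",
    [(i % 100, f"2025-01-{i % 28 + 1:02d}", i * 1.5, "paid")
     for i in range(2_000)],
)
# Every column the query needs lives in the index itself.
conn.execute(
    "CREATE INDEX idx_orders_cover ON orders"
    "(customer_id, order_date, total_amount)"
)

plan_row = conn.execute(
    "EXPLAIN QUERY PLAN SELECT total_amount FROM orders "
    "WHERE customer_id = 42 AND order_date >= '2025-01-10'"
).fetchone()
print(plan_row[3])  # SQLite explicitly reports a COVERING INDEX here
```

Because customer_id, order_date, and total_amount are all present in the index, the engine answers the query without ever touching the table, which is exactly what an index-only scan means.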
Advanced Index Types: GiST, GIN, BRIN
Beyond B-Tree and Hash, modern databases offer specialized index types for specific data types and use cases. PostgreSQL leads in index type diversity, providing GiST (Generalized Search Tree) for geometric data and full-text search, GIN (Generalized Inverted Index) for array columns and JSONB, and BRIN (Block Range Index) for naturally ordered data in large tables.
GiST—geospatial data, full-text search, range types
GIN—arrays, JSONB, tsvector for full-text search
BRIN—naturally ordered data in very large tables
Bloom—probabilistic index for multiple columns
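The idea behind GIN is worth seeing in miniature: an inverted index maps each element (a token, an array member, a JSONB key) to the set of rows containing it, so containment queries become set operations. A conceptual Python sketch, not PostgreSQL's actual implementation:

```python
from collections import defaultdict

# Three "rows" of tokenized text, keyed by row id (invented sample data).
docs = {
    1: ["postgres", "index", "performance"],
    2: ["mysql", "index"],
    3: ["postgres", "replication"],
}

# Build the inverted index: token -> set of row ids containing it.
inverted = defaultdict(set)
for row_id, tokens in docs.items():
    for token in tokens:
        inverted[token].add(row_id)

# "Which rows contain both 'postgres' AND 'index'?" is a set intersection.
matches = inverted["postgres"] & inverted["index"]
print(sorted(matches))  # [1]
```

This is why GIN excels at queries like `tsvector @@ tsquery` or `jsonb @> '{...}'`: instead of scanning rows, the engine intersects the posting lists of the requested elements.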
Index Maintenance: The Hidden Cost
Every index you create is a promise to the database engine: update this structure whenever data changes. For INSERT operations, the database must add entries to every index on the table. For UPDATE operations, if the indexed column changes, the database removes the old entry and inserts the new one across all indexes. For DELETE, it removes entries from all affected indexes.
This maintenance overhead accumulates. A table with 10 indexes pays roughly 10 times the index-maintenance cost on every write compared to a table with one index. Bulk inserts become dramatically slower. In write-heavy workloads, removing little-used indexes can improve throughput more than adding indexes improves read performance. You must balance your read optimization against write performance requirements.
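The write tax is easy to measure yourself. A rough benchmark sketch using SQLite (absolute numbers will vary by machine and engine; the point is the relative slowdown as index count grows):

```python
import sqlite3
import time

def bulk_insert(index_count):
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"c{i} INTEGER" for i in range(10))
    conn.execute(f"CREATE TABLE t (id INTEGER PRIMARY KEY, {cols})")
    # Each extra index is one more structure to update on every INSERT.
    for i in range(index_count):
        conn.execute(f"CREATE INDEX idx_{i} ON t(c{i})")
    rows = [tuple(range(j, j + 10)) for j in range(20_000)]
    placeholders = ", ".join("?" * 10)
    col_list = ", ".join(f"c{i}" for i in range(10))
    start = time.perf_counter()
    with conn:  # one transaction for the whole bulk insert
        conn.executemany(
            f"INSERT INTO t ({col_list}) VALUES ({placeholders})", rows
        )
    return time.perf_counter() - start

t0 = bulk_insert(0)
t10 = bulk_insert(10)
print(f"no indexes: {t0:.3f}s, ten indexes: {t10:.3f}s")
```

On a typical machine the ten-index run is several times slower than the index-free run, which is why bulk-load pipelines often drop indexes first and rebuild them afterward.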
Monitoring Index Health
Use database-specific tools to monitor index usage. PostgreSQL provides pg_stat_user_indexes for scan counts (the idx_scan column) and the pg_relation_size() function for index sizes. MySQL offers SHOW INDEX and the sys schema's schema_unused_indexes view. Regular index audits reveal unused indexes consuming space and slowing writes without benefiting any queries. Remove indexes that have never been scanned or have extremely low usage relative to their size.
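Every engine exposes some form of index introspection. SQLite has no usage statistics like pg_stat_user_indexes, but even its minimal catalog lets you enumerate what exists, which is the first step of any audit (table and index names are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users(email)")
conn.execute("CREATE INDEX idx_users_name ON users(name)")

# PRAGMA index_list returns (seq, name, unique, origin, partial) per index.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('users')")]
for name in index_names:
    # sqlite_master stores the original CREATE INDEX statement.
    ddl = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'index' AND name = ?",
        (name,),
    ).fetchone()[0]
    print(name, "->", ddl)
```

In PostgreSQL the equivalent audit joins pg_stat_user_indexes with size functions to find indexes whose scan count is zero; the principle is the same: enumerate, measure, and drop what never earns its keep.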
How Boundev Solves This for You
Everything we have covered in this blog — slow queries, index design, write overhead, and query optimization — is exactly what our backend engineering team handles every day. Here is how we approach database performance for our clients.
We build you a full backend team including database architects — screened, onboarded, and optimizing from week one.
Plug pre-vetted backend engineers with database expertise into your existing team — fast, seamless integration.
Hand us your backend optimization challenge. We architect, implement, and deliver database solutions at scale.
The Bottom Line
Indexes are the highest-leverage tool you have for query performance, but none of them are free. Let your real query patterns drive the design: default to B-Tree, order composite columns by the leftmost prefix principle, reach for partial and covering indexes when a hot subset dominates, and audit regularly so unused indexes do not quietly tax every write.

Explore Boundev's Services
Ready to put what you just learned into action? Here is how we can help.
Frequently Asked Questions
What is the difference between B-Tree and Hash indexes?
B-Tree indexes maintain sorted order and support range queries, inequality comparisons, and sorting operations. Hash indexes use a hash function for exact match lookups only, offering O(1) lookup speed for equality predicates but no support for range queries or ordered operations. B-Tree is the default and best choice for most application queries.
How do composite indexes work in SQL?
A composite index spans multiple columns and stores concatenated values in sorted order. The critical rule is the leftmost prefix principle: the index can only be used for queries that filter by the first column, or the first column plus any subsequent columns in order. For an index on (A, B, C), queries filtering by A alone or A plus B work, but queries filtering by B alone or C alone cannot use the index efficiently.
How many indexes should a database table have?
There is no universal number. The right count depends on your workload balance between reads and writes. Each index improves read queries but slows down INSERT, UPDATE, and DELETE operations. For read-heavy OLTP workloads, you might have 5-10 indexes on frequently queried columns. For write-heavy workloads, you might have only 1-2 critical indexes. Regular index audits help identify unused indexes that should be removed.
What is a covering index in PostgreSQL?
A covering index includes all columns needed to satisfy a query, enabling an index-only scan where the database never accesses the table data. In PostgreSQL, you create a covering index using the INCLUDE clause: CREATE INDEX idx_cover ON table(key) INCLUDE (col1, col2). The included columns are stored in the index but not as part of the index key, avoiding sorting overhead while enabling fast index-only scans.
When should I use partial indexes?
Partial indexes are ideal when you frequently query a specific subset of rows. If 90% of your queries target active users, a partial index on active users is smaller, faster to maintain, and more likely to fit entirely in memory. They are also useful for enforcing uniqueness on a subset of rows, or indexing expensive columns only where they meet certain conditions.
Let Us Optimize Your Database
You now understand how to design and implement SQL indexes effectively. The next step is applying this knowledge to your specific workload.
200+ companies have trusted us with their database optimization needs. Tell us about your performance challenges — we will respond within 24 hours.
