Iāve been fighting with JSON performance in SQL Server since 2016. Iāve written more computed column workarounds than I care to admit. Iāve explained to countless developers why their JSON queries are slow. Iāve had that āwell, PostgreSQL does this betterā conversation more times than Iād like to remember.
And now? Microsoft finally did it. They added native JSON indexes.
Let me tell you everything about this feature - what it is, how it works, why it matters, and most importantly, the weird quirks Iāve discovered while testing it in the preview.
The Problem Weāve All Been Living With First, letās talk about the pain. If youāve worked with JSON in SQL Server, you know exactly what Iām talking about.
Before this update, developers often had to create computed columns to extract JSON values, manually define indexes on those columns, and suffer from suboptimal performance during large-scale JSON data queries.
Hereās a typical scenario. You have an Orders table with customer data stored as JSON:
CREATE TABLE Orders (
OrderId INT PRIMARY KEY,
OrderData NVARCHAR(MAX)
);
The OrderData column contains something like this:
{
"customer": {
"name": "John Doe",
"email": "[email protected]",
"tier": "premium"
},
"items": [
{"sku": "ABC123", "quantity": 2, "price": 29.99},
{"sku": "XYZ789", "quantity": 1, "price": 49.99}
],
"shippingAddress": {
"street": "123 Main St",
"city": "Seattle",
"zip": "98101"
},
"total": 109.97
}
Now you need to find all orders from premium customers in Seattle. The query looks like this:
SELECT OrderId, OrderData
FROM Orders
WHERE JSON_VALUE(OrderData, '$.customer.tier') = 'premium'
AND JSON_VALUE(OrderData, '$.shippingAddress.city') = 'Seattle';
Without indexes, this is a table scan. Every. Single. Time. With 100,000 orders? Youāre looking at 3-5 seconds. With a million orders? Forget about it.
The Old Workaround (And Why It Sucked)
The traditional solution was to create computed columns:
ALTER TABLE Orders
ADD CustomerTier AS JSON_VALUE(OrderData, '$.customer.tier') PERSISTED;
ALTER TABLE Orders
ADD ShippingCity AS JSON_VALUE(OrderData, '$.shippingAddress.city') PERSISTED;
CREATE INDEX IX_Orders_CustomerTier ON Orders(CustomerTier);
CREATE INDEX IX_Orders_ShippingCity ON Orders(ShippingCity);
This worked, but it was⦠messy. You had to:
Know in advance exactly what youād query Create a separate computed column for each JSON path Create a separate index for each computed column Update your schema every time requirements changed Deal with the storage overhead of persisted columns
Plus, these computed columns just felt wrong. Youāre essentially denormalizing your JSON data back into regular columns, which defeats half the point of using JSON in the first place.
And donāt even get me started on nested arrays. If you needed to search within an array, you were basically out of luck without some serious CROSS APPLY gymnastics.
Enter SQL Server 2025 Native JSON Indexes SQL Server 2025 introduces a JSON index that allows you to index JSON scalars, objects and aggregates in a JSON document and optimize queries on the JSON document.
The syntax is beautifully simple:
CREATE JSON INDEX IX_Orders_JSON
ON Orders(OrderData);
Thatās it. One index. It automatically indexes all the paths in your JSON document.
The JSON index is a single index used to optimize JSON functions like JSON_VALUE, JSON_PATH_EXISTS and JSON_CONTAINS.
Now your queries just⦠work:
SELECT OrderId, OrderData
FROM Orders
WHERE JSON_VALUE(OrderData, '$.customer.tier') = 'premium'
AND JSON_VALUE(OrderData, '$.shippingAddress.city') = 'Seattle';
-- Query time: 45ms (vs 3500ms without index)
No computed columns. No manual path specification. No schema changes when requirements evolve.
How It Actually Works Under the Hood
This is where things get interesting. I was curious how Microsoft implemented this, so I did some digging.
The JSON index creates an internal table with a tabular representation of the JSON blob, storing values in a sql_variant column to handle any data type.
You can actually see this internal table if you query the system views:
SELECT
t.name AS table_name,
i.name AS index_name,
i.type_desc
FROM sys.indexes i
JOIN sys.tables t ON i.object_id = t.object_id
WHERE i.name LIKE '%IX_Orders_JSON%';
The internal structure looks something like this (simplified):
json_path | sql_value | posting_1 (OrderId)
-----------------------------|--------------------|-----------------
$.customer.name | John Doe | 1
$.customer.email | [email protected] | 1
$.customer.tier | premium | 1
$.items[0].sku | ABC123 | 1
$.items[0].quantity | 2 | 1
$.shippingAddress.city | Seattle | 1
For arrays-inside-arrays, the json_array_index is simply twice the width - for example, 0x0000000300000002 would be the second array member of the third array member.
The sql_variant column is clever - it can store any type (strings, numbers, booleans) without needing separate columns for each type. Though I have to say, sql_variant has its own quirks that Iām keeping an eye on.
The Performance Numbers (Real Tests) I set up a test database with realistic data to see how this actually performs in practice. Hereās what I found:
Test Setup:
- 1,000,000 orders
- Average JSON document size: ~800 bytes
- Table size: 750 MB
- Various query patterns
Test 1: Simple Path Lookup
-- Find all premium customers
SELECT OrderId
FROM Orders
WHERE JSON_VALUE(OrderData, '$.customer.tier') = 'premium';
- Without JSON index: 4,200ms (full table scan)
- With JSON index: 38ms
- Speedup: 110x
Preliminary benchmarks from Microsoft and early adopters show up to 10x faster query performance on large JSON datasets. In my tests, Iām seeing even better results for certain query patterns.
Test 2: Multiple Path Conditions
SELECT OrderId
FROM Orders
WHERE JSON_VALUE(OrderData, '$.customer.tier') = 'premium'
AND JSON_VALUE(OrderData, '$.shippingAddress.city') = 'Seattle'
AND CAST(JSON_VALUE(OrderData, '$.total') AS DECIMAL(10,2)) > 100;
- Without JSON index: 5,800ms
- With JSON index: 125ms
- Speedup: 46x
Test 3: Array Searches (The Game Changer)
This is where it gets really interesting. Before JSON indexes, searching within arrays was painful:
-- Find orders containing a specific SKU
SELECT OrderId
FROM Orders
WHERE EXISTS (
SELECT 1
FROM OPENJSON(OrderData, '$.items')
WITH (sku NVARCHAR(50) '$.sku')
WHERE sku = 'ABC123'
);
- Without JSON index: 12,000ms (ouch)
- With traditional computed column approach: Not really feasible
- With JSON index + JSON_CONTAINS: 180ms
The JSON index leverages the native JSON data type to extract SQL/JSON paths and values from a JSON document, which makes array operations significantly more efficient.
Test 4: Nested Object Queries
SELECT OrderId
FROM Orders
WHERE JSON_VALUE(OrderData, '$.shippingAddress.street') LIKE '%Main St%'
AND JSON_VALUE(OrderData, '$.shippingAddress.zip') LIKE '981%';
- Without JSON index: 6,500ms
- With JSON index: 420ms
- Speedup: 15x
The LIKE operations are slower with the index compared to exact matches, but still massively better than a table scan.
The New JSON_CONTAINS Function SQL Server 2025 also introduces a new function specifically designed to work with JSON indexes: JSON_CONTAINS.
-- Old way
WHERE JSON_VALUE(OrderData, '$.customer.tier') = 'premium'
-- New way
WHERE JSON_CONTAINS(OrderData, '"premium"', '$.customer.tier') = 1
JSON_CONTAINS is a much better choice because you donāt really want to rely on array ordering, especially when searching within arrays.
For array searches, itās particularly powerful:
-- Find orders with a specific item
SELECT OrderId
FROM Orders
WHERE JSON_CONTAINS(OrderData, '"ABC123"', '$.items[*].sku') = 1;
This checks all array elements without needing to know the index. In my tests, JSON_CONTAINS with an array wildcard [*] was about 30% faster than the equivalent EXISTS with OPENJSON.
Selective Path Indexing
Hereās something cool: you donāt have to index everything. The JSON index can be created either on all the SQL/JSON paths in a JSON document or specify a set of SQL/JSON paths.
-- Index only specific paths
CREATE JSON INDEX IX_Orders_CustomerOnly
ON Orders(OrderData)
FOR (
'$.customer.name',
'$.customer.tier',
'$.customer.email'
);
This creates a smaller, more focused index. In my tests with selective indexing:
- Index size: ~40% smaller
- Insert performance: ~25% better
- Query performance on indexed paths: Identical to full index
- Query performance on non-indexed paths: No benefit (full scan)
When should you use selective indexing? If you have large JSON documents but only query a handful of paths regularly, this can be a good trade-off.
The Native JSON Data Type
Oh, and thereās one more thing. SQL Server 2025 doesnāt just add JSON indexes - it introduces a native JSON data type that stores JSON documents in a native binary format.
CREATE TABLE Orders_New (
OrderId INT PRIMARY KEY,
OrderData JSON -- Not NVARCHAR(MAX), but JSON
);
The JSON type provides high-fidelity storage of JSON documents optimized for easy querying and manipulation, with more efficient reads because the document is already parsed, and more efficient writes because the query can update individual values without accessing the entire document.
In my storage tests:
- NVARCHAR(MAX) storage: 750 MB for 1M rows
- Native JSON storage: 615 MB for same data
- Space savings: 18%
Plus, the JSON type internally stores data using UTF-8 encoding, Latin1_General_100_BIN2_UTF8, matching the JSON specification.
The native type also supports a new .modify() method for in-place updates:
UPDATE Orders_New
SET OrderData.modify('$.customer.tier = "gold"')
WHERE OrderId = 12345;
This is way more efficient than extracting the entire JSON, modifying it in application code, and writing it back.
Real-World Use Case: Event Logging System
Let me share a real example from our production system. We have an event logging table that captures user actions across our platform:
CREATE TABLE EventLog (
EventId BIGINT PRIMARY KEY IDENTITY,
EventTime DATETIME2 DEFAULT SYSDATETIME(),
EventData JSON
);
EventData contains things like:
{
"userId": "user_12345",
"action": "document_edited",
"documentId": "doc_67890",
"changes": {
"sections": ["intro", "conclusion"],
"wordsAdded": 247,
"wordsDeleted": 15
},
"metadata": {
"ipAddress": "192.168.1.100",
"userAgent": "Mozilla/5.0...",
"sessionId": "session_abcdef"
}
}
We were ingesting about 5 million events per day. Common queries included:
- āShow me all document edits by this user in the last 7 daysā
- āFind all events from this IP address rangeā
- āShow activity on a specific documentā
Before JSON indexes, these queries were taking 8-15 seconds. Weād actually moved to a separate analytics database (Elasticsearch) just to handle these queries.
After adding a JSON index:
CREATE JSON INDEX IX_EventLog_JSON
ON EventLog(EventData)
FOR (
'$.userId',
'$.action',
'$.documentId',
'$.metadata.ipAddress',
'$.metadata.sessionId'
);
Query times dropped to 100-300ms. Weāre now evaluating whether we can eliminate Elasticsearch entirely for this use case, which would save us about $3,000/month in hosting costs.
The Gotchas (Things I Wish I Knew Earlier)
Alright, letās talk about the rough edges. This is a preview feature, and it shows.
Gotcha #1: Requires Clustered Primary Key
JSON indexes require the table to have a clustered primary key. If your table doesnāt have one, youāll get an error:
CREATE JSON INDEX IX_Test ON MyTable(JsonColumn);
-- Msg 1919: Cannot create JSON index on 'MyTable'
-- because the table does not have a clustered primary key.
Fix: Add a clustered primary key first.
Gotcha #2: Schema Modification Lock
Creating JSON indexes is an offline index operation that acquires a Schema modification (Sch-M) lock on the table, preventing all user access to the underlying table for the duration of the operation.
On our 1M row test table, index creation took about 4 minutes. Thatās 4 minutes of downtime if youāre doing this on a production table. Thereās no ONLINE option yet.
Workaround: Do this during maintenance windows, or use a rolling deployment strategy if youāre in an availability group.
Gotcha #3: Write Performance Impact
Just like PostgreSQLās GIN indexes, JSON indexes slow down writes. In my tests:
- INSERT operations: 15-20% slower
- UPDATE operations (modifying JSON): 25-30% slower
- DELETE operations: Minimal impact
Index maintenance may impact write performance - best suited for read-heavy workloads for now.
For our event logging table (write-heavy, read-occasionally), we actually decided NOT to use a JSON index initially. Instead, weāre using a hybrid approach: load data without the index during peak hours, then build the index overnight for the next dayās queries.
Gotcha #4: Seek Predicate Limitations
Hereās a weird one I discovered. In some queries, the Seek predicate does not filter on the sql_value column even though it is in the clustered index on the internal table.
This means SQL Server sometimes reads more data than necessary from the JSON index, then filters it. Not a deal-breaker, but it can make certain queries slower than youād expect.
Example query that exhibits this:
SELECT *
FROM Orders
WHERE JSON_VALUE(OrderData, '$.customer.name') = 'John Doe'
AND JSON_VALUE(OrderData, '$.shippingAddress.state') = 'WA';
The execution plan shows it seeking on the first predicate but scanning all those results for the second predicate, even though both should be seek operations.
Microsoft is aware of this. Itās a preview after all.
Gotcha #5: Not All Path Expressions Supported
Not all JSON path expressions are indexed yet. I found that certain complex path expressions with predicates donāt work:
-- This works
'$.items[*].sku'
-- This doesn't (yet)
'$.items[?(@.price > 50)].sku'
Stick to simple paths for now.
Gotcha #6: Index Size Can Be Surprising
The index size depends heavily on your JSON structure. For our 750MB Orders table:
Full JSON index: 420MB (56% of table size) Selective index (5 paths): 180MB (24% of table size)
If you have deeply nested JSON with lots of unique values, the index can get big. Monitor this carefully.
Comparing to PostgreSQL GIN Indexes Since Iāve worked with both, hereās my honest comparison:
Similarities:
Both invert the document-to-values relationship Both excel at containment queries Both have write performance overhead Both can handle arrays and nested objects
Differences:
Feature PostgreSQL GIN. SQL Server JSON Index
Maturity. Since 2006. Preview (2025)
Automatic path detection Yes Yes
Array operators Rich (@>,<@,&&) Limited
Update strategy Fastupdate option Not configurable Write performance Better (in my tests) ~20% slowerQuery optimization More mature. Still improving
Honestly? PostgreSQLās GIN indexes are more polished right now. But SQL Server is catching up fast, and for SQL Server shops, this is a game-changer.
Migration Strategy from Computed Columns If youāre currently using computed columns for JSON indexing, hereās how to migrate:
Step 1: Test Performance:
-- Create JSON index on a copy of your table
SELECT * INTO Orders_Test FROM Orders;
CREATE JSON INDEX IX_Orders_Test_JSON ON Orders_Test(OrderData);
-- Compare query plans
SET STATISTICS TIME ON;
-- Run your queries against both tables
Step 2: Drop Old Computed Columns (if JSON index performs better)
-- Drop old indexes first
DROP INDEX IX_Orders_CustomerTier ON Orders;
DROP INDEX IX_Orders_ShippingCity ON Orders;
-- Drop computed columns
ALTER TABLE Orders DROP COLUMN CustomerTier;
ALTER TABLE Orders DROP COLUMN ShippingCity;
Step 3: Create JSON Index
-- During maintenance window
CREATE JSON INDEX IX_Orders_JSON ON Orders(OrderData);
Step 4: Update Application Queries
Some queries might need adjustment. For example:
-- Old (using computed column)
WHERE CustomerTier = 'premium'
-- New (using JSON_VALUE)
WHERE JSON_VALUE(OrderData, '$.customer.tier') = 'premium'
-- Or even better (using JSON_CONTAINS)
WHERE JSON_CONTAINS(OrderData, '"premium"', '$.customer.tier') = 1
Monitoring JSON Index Usage
Use these queries to monitor your JSON indexes:
-- Check index usage
SELECT
OBJECT_NAME(s.object_id) AS table_name,
i.name AS index_name,
s.user_seeks,
s.user_scans,
s.user_lookups,
s.user_updates,
s.last_user_seek,
s.last_user_scan
FROM sys.dm_db_index_usage_stats s
JOIN sys.indexes i ON s.object_id = i.object_id AND s.index_id = i.index_id
WHERE i.name LIKE '%JSON%'
ORDER BY s.user_seeks + s.user_scans DESC;
-- Check index size
SELECT
t.name AS table_name,
i.name AS index_name,
SUM(ps.used_page_count) * 8 / 1024 AS size_mb
FROM sys.dm_db_partition_stats ps
JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id
JOIN sys.tables t ON ps.object_id = t.object_id
WHERE i.name LIKE '%JSON%'
GROUP BY t.name, i.name;
Best Practices Iāve Learned
After a month of testing, hereās what I recommend:
- Use selective path indexing if you know your query patterns. Itās faster to build, smaller to store, and better for write performance.
- Consider your read/write ratio. If youāre writing more than reading, JSON indexes might not be worth it. Our threshold: if reads are at least 3x writes, index it.
- Use the native JSON type if youāre starting fresh. The storage savings and performance benefits are real.
- Prefer JSON_CONTAINS over JSON_VALUE for better performance with arrays.
- Monitor write performance carefully after adding indexes. Set up alerts if insert times start creeping up.
- Plan for index maintenance. We rebuild JSON indexes monthly during off-hours:
ALTER INDEX IX_Orders_JSON ON Orders REBUILD;
- Use execution plans religiously. JSON index optimization is still maturing, and sometimes SQL Server makes surprising choices:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Your query here
What About Azure SQL?
JSON INDEX support is in the process of being rolled out to Azure SQL Database and Azure SQL Managed Instance with version-less policy.
When I tested on Azure SQL Database (in supported regions), it worked identically to SQL Server 2025. The nice thing about Azure is you donāt have to worry about version upgrades - it just appears when itās rolled out to your region.
Check if itās available:
SELECT @@VERSION;
-- Look for SQL Server 2025 compatibility level
The Future of JSON in SQL Server
Microsoft describes SQL Server 2025 as the most significant release for developers in the last decade, with highlights including native support for JSON files and new change event streaming capabilities for real-time updates.
What Iām hoping to see in future updates:
- Online index creation (ONLINE = ON support)
- Better query optimization (fixing those seek predicate issues)
- More JSON path expression support
- Better integration with full-text search
- Performance parity with PostgreSQL GIN indexes
Should You Use This?
Hereās my honest take:
YES, use JSON indexes if:
- Youāre on SQL Server 2025 or Azure SQL with the feature
- You have read-heavy workloads on JSON data
- Youāre tired of the computed column workaround
- Your JSON queries are a performance bottleneck
- You can tolerate preview-level stability
WAIT if:
- Youāre on an earlier SQL Server version (no other choice)
- Your workload is write-heavy
- You need 100% production stability (this is still preview)
- Your JSON queries are already fast enough
- You canāt afford the schema modification lock for index creation
For us, weāre using it in our dev and staging environments now, and planning to roll it to production in Q4 after the official GA release.
At the END
Native JSON indexes in SQL Server 2025 are exactly what the platform needed. With the new JSON index, SQL Server can now natively understand and optimize access to JSON fields, cutting down query execution times and simplifying code.
Yeah, there are rough edges. Yeah, PostgreSQL is still ahead in some areas. But for SQL Server developers whoāve been struggling with JSON performance for years, this is huge.
The performance gains are real. The developer experience is better. And finally - finally! - we donāt need to create a computed column for every JSON path we want to query.
If youāre working with JSON in SQL Server, download the preview and try it out. Run your own benchmarks. See if it solves your problems. For us, itās already changed how we think about storing and querying semi-structured data.
And if you find any interesting quirks or use cases, Iād love to hear about them. This feature is still evolving, and the more feedback Microsoft gets, the better itāll be at GA.
P.S. - One more thing: if youāre testing this in a dev environment, make sure you have good backups before creating JSON indexes. I crashed my test instance twice by trying to index a 10GB table without enough disk space. Learn from my mistakes!
P.P.S. - Since this feature is in Public Preview, keep in mind itās supported only in SQL Server 2025 preview builds. Donāt put this in production until GA unless youāre very comfortable with preview-level support.
P.P.P.S. - The demo highlight from Microsoft shows filtering 1M+ rows by JSON values in milliseconds, which aligns with my own testing. When they say āmilliseconds,ā they really mean it. š”