Notifications
Clear all

SQL Server - Logical Reads to get snapshot of a table as of a certain day (INSERT only)


Posts: 1
Topic starter
(@stevenzimmer)
New Member
Joined: 2 years ago

Hello All,

I am wondering if I am generating a normal amount of logical reads required in SQL Server to get the most current records as of a certain date. It seems like approx 12 logical reads are required per record on a generic skeleton table I created with SHA2_256 keys.

Background:
My current SQL Server Version is 2012 (11.0.7462.6)

I created a test table with similar data changes as real data tables with three columns:

RAND_HKEY binary(32) - Business Key (SHA2_256)
RAND_DIFFKEY binary (32) - Hash Difference (SHA2_256)
Process_Date - System Driven Date for when the record became active

CREATE UNIQUE NONCLUSTERED INDEX [PERF_TEST_INDEX] ON [DV].[Sat_Perf_Test]
(
 [RAND_HKEY] ASC,
 [RAND_DIFFKEY] ASC,
 [Process_date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Unique Business Keys in Table: 1,737,286
Rows in table: 4,038,480

I have been questioned several times why we need pit and bridge tables when our datasets are so small. However, I find logical reads quickly get out of hand even with our small data volumes when combining multiple satellites without using bridge or pit tables. Am I missing something for the doing the basic "get the snapshot of the table as of the current day"?

I tried two different approaches in tsql to get 10000 rows:

Method 1:
Table 'Sat_Perf_Test'. Scan count 10001, logical reads 44043, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 63 ms,  elapsed time = 244 ms.

Method 2:
Table 'Worktable'. Scan count 6, logical reads 174570, physical reads 0, read-ahead reads 4231, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Sat_Perf_Test'. Scan count 3, logical reads 117810, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 12402 ms,  elapsed time = 7048 ms.

Method 1: SQL
SELECT TOP 10000 pt.[RAND_HKEY]
 ,pt.PROCESS_DATE
FROM [DV].[Sat_Perf_Test] pt
INNER JOIN (
 SELECT [RAND_HKEY]
  ,MAX(Process_Date) AS MAX_PROCESS_DATE
 FROM [DV].[Sat_Perf_Test] pt
 WHERE Process_Date <= '2019-01-01'
 GROUP BY [RAND_HKEY]
 ) max_pt ON pt.[RAND_HKEY] = max_pt.[RAND_HKEY]
 AND pt.PROCESS_DATE = max_pt.MAX_PROCESS_DATE

Method 2: SQL
SELECT TOP 10000 RAND_HKEY
 ,PROCESS_DATE
FROM (
 SELECT RAND_HKEY
  ,Process_Date
  ,LAST_VALUE(Process_Date) OVER (
   PARTITION BY [RAND_HKEY] ORDER BY Process_Date ROWS BETWEEN UNBOUNDED PRECEDING
     AND UNBOUNDED FOLLOWING
   ) AS End_Date
 FROM DV.[Sat_Perf_Test]
 WHERE Process_Date <= '2019-01-01'
 ) V1
WHERE V1.Process_Date = V1.End_Date

Thanks in advance!

Reply
2 Replies