Buffer
cache hit ratio (BCH) defines the
percent of data pages read from buffer cache, Normally sql engine tries to find
requested page from cache if not found there it fetch page from disk. If page
request found in cache is low if raises performance alert.
Nominally
90-95% plus threshold is good for BCH, though it is not 100% correct or
definitive in long time as if for some time BCH ratio is very good but after
some time if performance start decreasing due to low memory BCH ratio doesn’t
show low value as it was high value for long time and low performance for small
time doesn’t change ratio drastically.
declare @i as float
declare @j as float
Set @i = 0
Set @j = 0
SELECT @i= cntr_value FROM sys.dm_os_performance_counters
WHERE
[object_name] LIKE '%Buffer
Manager%'AND [counter_name] = 'Buffer cache hit ratio'
SELECT @j= cntr_value FROM sys.dm_os_performance_counters
WHERE
[object_name] LIKE '%Buffer
Manager%'AND [counter_name] like '%Buffer cache hit
ratio%'
select @i VALUE ,@j basevalue, (@i/@j)*100 as BCHR
Due to its misleading value we should check other memory counter like page life expectancy (PLE) or page reads/sec.
Page life expectancy (PLE) defines the life time of data pages
in buffer cache. If sql doesn’t find pages in buffer it reads data from disk
which is performance issue, so longer a page exist in buffer it results into better
performance.
In general if PLE is 300+ seconds its good and
if low we need to look for more memory or better use of index optimization.
Though we shouldn’t blindly go over 300 sec threshold as sudden drop in in PLE
value (may be over 50% can be a concern) so one should monitor PLE in a time
range.
Also PLE can be dependent over amount of memory
allocated and NUMA architecture.
SELECT object_name,
counter_name, cntr_value FROM sys.dm_os_performance_counters
WHERE
[object_name] LIKE '%Buffer Manager%'AND
[counter_name] = 'Buffer
cache hit ratio'
If memory resource allocated to server is very
high PLE values should be much higher, because no of new pages need to replace
in buffer would not be very high (though we assume transaction on server is
normal).
For eg : 128 gb memory on server amount to
large cache for data pages (8kb) and if transaction is not very high PLE could
come up to 1000 + seconds. In this case
16mb/sec i/o data would result into PLE
equivalent to 1000 , if i/o transaction
is low PLE would be higher and if i/i is high PLE would be lower.
Calculation : ((128*1024*1024)KB/8KB) /1000 sec
We also need to consider NUMA architecture here
for getting good result over PLE :
Most new
systems today use NUMA, and so the buffer pool is split up and managed per NUMA
node, with each NUMA node getting its own lazy
writer thread, managing its own buffer free list, and dealing with node-local
memory allocations. Think of each of these as a mini buffer pool.
PLE counter is calculated by adding the PLE of
each mini buffer pool and then getting the average. It means one node (NUMA)
could be low PLE but while averaging all PLE it doesn’t show low value .For instance,
for a machine with 8 NUMA nodes, with PLE of each being 100, the overall PLE is
1000. If one of them drops to 100, the overall PLE only drops to 775, which
means it would appear normal on first look.
On NUMA machines, you need to be looking at the
Buffer Node:Page Life Expectancy counters for all NUMA nodes otherwise you’re
not getting an accurate view of buffer pool memory pressure and so could be
missing performance issues.
One can see the lazywriter activity for each
NUMA node by looking for the lazywriter threads in sys.dm_exec_requests.
One can tell if your system uses NUMA or not by
querying sys.dm_os_memory_nodes. A non-NUMA system will have 2 nodes listed, 0
and 64/32 (which is the DAC resource and not a true node). A NUMA system will
have additional node entries for each of the hardware NUMA nodes in the system.
If you manually configured Soft-NUMA sys.dm_os_nodes would have the independent
CPU nodes under Soft-NUMA, but you don’t get the extra lazywriters or memory
partitioning per node under Soft-NUMA.
select * from sys.dm_os_nodes
select * from sys.dm_os_memory_clerks
select * from sys.dm_os_memory_nodes
To see PLE for NUMA node :
SELECT * FROM sys.dm_os_performance_counters
WHERE
[counter_name] = 'Page
life expectancy'
Memory
pressure doesn't always mean you need to add more physical memory—it may be you
don't have enough memory for your workload.
A server
with a query using 1.5 million logical reads means 11.5 gb of data read could
not be accommodated to 4 gb of buffer, which means we need to tweak query and
with better joining and indexing we can reduce the logical reads and without
adding more memory issue can be resolved.