Monday, January 12, 2015

Page life expectancy (PLE) and Buffer cache hit ratio (BCH)

Buffer cache hit ratio (BCH) defines the percent of data pages read from buffer cache, Normally sql engine tries to find requested page from cache if not found there it fetch page from disk. If page request found in cache is low if raises performance alert.

Nominally 90-95% plus threshold is good for BCH, though it is not 100% correct or definitive in long time as if for some time BCH ratio is very good but after some time if performance start decreasing due to low memory BCH ratio doesn’t show low value as it was high value for long time and low performance for small time doesn’t change ratio drastically.

declare @i as float
declare @j as float
Set @i = 0
Set @j = 0
SELECT @i= cntr_value FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'AND [counter_name] = 'Buffer cache hit ratio'
SELECT @j= cntr_value FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'AND [counter_name] like '%Buffer cache hit ratio%'
select @i VALUE ,@j basevalue, (@i/@j)*100 as BCHR

Due to its misleading value we should check other memory counter like page life expectancy (PLE) or page reads/sec.

Page life expectancy (PLE) defines the life time of data pages in buffer cache. If sql doesn’t find pages in buffer it reads data from disk which is performance issue, so longer a page exist in buffer it results into better performance.

In general if PLE is 300+ seconds its good and if low we need to look for more memory or better use of index optimization. Though we shouldn’t blindly go over 300 sec threshold as sudden drop in in PLE value (may be over 50% can be a concern) so one should monitor PLE in a time range.

Also PLE can be dependent over amount of memory allocated and NUMA architecture.

SELECT object_name, counter_name, cntr_value FROM sys.dm_os_performance_counters WHERE   [object_name] LIKE '%Buffer Manager%'AND [counter_name] = 'Buffer cache hit ratio'
If memory resource allocated to server is very high PLE values should be much higher, because no of new pages need to replace in buffer would not be very high (though we assume transaction on server is normal).

For eg : 128 gb memory on server amount to large cache for data pages (8kb) and if transaction is not very high PLE could come up to 1000 + seconds.  In this case 16mb/sec  i/o data would result into PLE equivalent to 1000 , if i/o  transaction is low PLE would be higher and if i/i is high PLE would be lower.
Calculation : ((128*1024*1024)KB/8KB) /1000 sec

We also need to consider NUMA architecture here for getting good result over PLE :

Most new systems today use NUMA, and so the buffer pool is split up and managed per NUMA
node, with each NUMA node getting its own lazy writer thread, managing its own buffer free list, and dealing with node-local memory allocations. Think of each of these as a mini buffer pool.

PLE counter is calculated by adding the PLE of each mini buffer pool and then getting the average. It means one node (NUMA) could be low PLE but while averaging all PLE it doesn’t show low value .For instance, for a machine with 8 NUMA nodes, with PLE of each being 100, the overall PLE is 1000. If one of them drops to 100, the overall PLE only drops to 775, which means it would appear normal on first look.

On NUMA machines, you need to be looking at the Buffer Node:Page Life Expectancy counters for all NUMA nodes otherwise you’re not getting an accurate view of buffer pool memory pressure and so could be missing performance issues.

One can see the lazywriter activity for each NUMA node by looking for the lazywriter threads in sys.dm_exec_requests.

One can tell if your system uses NUMA or not by querying sys.dm_os_memory_nodes. A non-NUMA system will have 2 nodes listed, 0 and 64/32 (which is the DAC resource and not a true node). A NUMA system will have additional node entries for each of the hardware NUMA nodes in the system. If you manually configured Soft-NUMA sys.dm_os_nodes would have the independent CPU nodes under Soft-NUMA, but you don’t get the extra lazywriters or memory partitioning per node under Soft-NUMA.

select * from sys.dm_os_nodes
select * from sys.dm_os_memory_clerks
select * from sys.dm_os_memory_nodes

To see PLE for NUMA node :
SELECT * FROM sys.dm_os_performance_counters
WHERE [counter_name] = 'Page life expectancy'

Memory pressure doesn't always mean you need to add more physical memory—it may be you don't have enough memory for your workload.

A server with a query using 1.5 million logical reads means 11.5 gb of data read could not be accommodated to 4 gb of buffer, which means we need to tweak query and with better joining and indexing we can reduce the logical reads and without adding more memory issue can be resolved.

No comments:

Post a Comment