Monday, September 16, 2013

.Net CLR Memory Counters

Large Object Heap Size:

This counter displays the current size of the Large Object Heap in bytes. Objects of 85,000 bytes or larger are treated as large objects by the Garbage Collector and are allocated directly in a special heap; they are not promoted through the generations. This counter is updated at the end of a GC; it is not updated on every allocation.
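As a minimal sketch (not from the original counter description), the following C# snippet shows that a sufficiently large array is reported as belonging to generation 2, because the Large Object Heap is logically collected together with generation 2:

using System;

class LohDemo
{
    static void Main()
    {
        byte[] small = new byte[1000];    // well below the LOH threshold
        byte[] large = new byte[100000];  // above ~85,000 bytes, allocated on the LOH

        // Small objects start in generation 0; LOH objects are reported as generation 2.
        Console.WriteLine(GC.GetGeneration(small));  // typically 0
        Console.WriteLine(GC.GetGeneration(large));  // 2
    }
}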

% Time in GC:

Time in GC is the percentage of elapsed time that was spent in performing a garbage collection (GC) since the last GC cycle. This counter is usually an indicator of the work done by the Garbage Collector on behalf of the application to collect and compact memory. This counter is updated only at the end of every GC, and the counter value reflects the last observed value; it is not an average.

# Bytes in all Heaps

This counter is the sum of four other counters: Gen 0 Heap Size, Gen 1 Heap Size, Gen 2 Heap Size, and Large Object Heap Size. It indicates the current memory allocated, in bytes, on the GC heaps.
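If it helps to see these counters in code, here is a hedged sketch that samples the .NET CLR Memory counters for a process with System.Diagnostics.PerformanceCounter. The instance name is assumed to equal the process name; when several instances of the same executable run, Windows appends a #N suffix, so verify the instance in Performance Monitor.

using System;
using System.Diagnostics;

class ClrMemoryCounters
{
    static void Main()
    {
        // Assumed: the counter instance name matches the process name (e.g. "w3wp").
        string instance = Process.GetCurrentProcess().ProcessName;

        var bytesInAllHeaps = new PerformanceCounter(".NET CLR Memory", "# Bytes in all Heaps", instance, true);
        var timeInGc        = new PerformanceCounter(".NET CLR Memory", "% Time in GC", instance, true);
        var lohSize         = new PerformanceCounter(".NET CLR Memory", "Large Object Heap size", instance, true);

        Console.WriteLine("Bytes in all heaps : {0:N0}", bytesInAllHeaps.NextValue());
        Console.WriteLine("% time in GC       : {0:F2}", timeInGc.NextValue());
        Console.WriteLine("LOH size           : {0:N0}", lohSize.NextValue());
    }
}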

 

# Gen 0 Collections:

 

This counter displays the number of times the generation 0 objects (the youngest, most recently allocated) have been garbage collected (Gen 0 GC) since the start of the application. A Gen 0 GC occurs when the available memory in generation 0 is not sufficient to satisfy an allocation request. This counter is incremented at the end of a Gen 0 GC. Higher generation GCs include all lower generation GCs, so this counter is also incremented when a higher generation (Gen 1 or Gen 2) GC occurs. The _Global_ counter value is not accurate and should be ignored. This counter displays the last observed value.

 

# Gen 1 Collections:

 

This counter displays the number of times the generation 1 objects have been garbage collected since the start of the application. The counter is incremented at the end of a Gen 1 GC. Higher generation GCs include all lower generation GCs, so this counter is also incremented when a higher generation (Gen 2) GC occurs. The _Global_ counter value is not accurate and should be ignored. This counter displays the last observed value.

 

# Gen 2 Collections:

This counter displays the number of times the generation 2 (oldest) objects have been garbage collected since the start of the application. The counter is incremented at the end of a Gen 2 GC (also called a full GC). The _Global_ counter value is not accurate and should be ignored. This counter displays the last observed value.
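For an in-process view of the same numbers, GC.CollectionCount reports how many collections of each generation have occurred since the application started. This is a small illustrative sketch, not part of the original counter definitions:

using System;

class GcCollectionCounts
{
    static void Main()
    {
        // Allocate some short-lived garbage to provoke a few Gen 0 collections.
        for (int i = 0; i < 100000; i++)
        {
            var temp = new byte[1024];
        }

        // A full (Gen 2) collection also counts as a Gen 1 and a Gen 0 collection,
        // which is why higher-generation GCs are included in the lower-generation counts.
        GC.Collect();

        Console.WriteLine("Gen 0 collections: " + GC.CollectionCount(0));
        Console.WriteLine("Gen 1 collections: " + GC.CollectionCount(1));
        Console.WriteLine("Gen 2 collections: " + GC.CollectionCount(2));
    }
}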

# of Pinned Objects:

This counter displays the number of pinned objects encountered in the last GC. This counter tracks the pinned objects only in the heaps that were garbage collected; e.g., a Gen 0 GC would cause enumeration of pinned objects in the generation 0 heap only. A pinned object is one that the Garbage Collector cannot move in memory.
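A quick, hedged illustration of how an object becomes pinned and therefore immovable for the Garbage Collector, using a pinned GCHandle (the C# fixed statement pins in the same way, but only for the duration of its block):

using System;
using System.Runtime.InteropServices;

class PinningDemo
{
    static void Main()
    {
        byte[] buffer = new byte[4096];

        // Explicitly pin the array; the GC cannot relocate it until the handle is freed.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            Console.WriteLine("Pinned at: " + address);
        }
        finally
        {
            handle.Free();  // unpin as soon as possible so the GC is free to compact
        }
    }
}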

 

Exceptions:

 

# of Exceps Thrown/Sec:

This counter displays the number of exceptions thrown per second. These include both .NET exceptions and unmanaged exceptions that get converted into .NET exceptions; e.g., a null pointer reference exception in unmanaged code would be rethrown in managed code as a .NET System.NullReferenceException. This counter includes both handled and unhandled exceptions. Exceptions should occur only in rare situations and not in the normal control flow of the program; this counter was designed as an indicator of potential performance problems due to a large (>100s) rate of exceptions thrown. This counter is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval.
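To illustrate why a high exception rate hurts, here is a small, hypothetical sketch contrasting exception-based control flow with a non-throwing alternative; the method names are illustrative and not from the original text:

using System;

class ExceptionRateDemo
{
    // Bad: uses an exception for an expected, common condition.
    static int ParseWithException(string text)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException)
        {
            return -1;
        }
    }

    // Better: no exception is thrown on the common failure path,
    // so "# of Exceps Thrown/Sec" stays low under load.
    static int ParseWithoutException(string text)
    {
        int value;
        return int.TryParse(text, out value) ? value : -1;
    }

    static void Main()
    {
        Console.WriteLine(ParseWithException("not a number"));
        Console.WriteLine(ParseWithoutException("not a number"));
    }
}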

 

Throw to Catch Depth/Sec:

This counter displays the number of stack frames traversed, per second, from the frame that threw the .NET exception to the frame that handled the exception. This counter resets to 0 when an exception handler is entered, so nested exceptions would show the handler-to-handler stack depth. This counter is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval.

 

Locking and Threading:

 

Contention Rate/Sec:
Rate at which threads in the runtime attempt to acquire a managed lock unsuccessfully. Managed locks can be acquired in many ways: by the "lock" statement in C#, by calling System.Monitor.Enter, or by using the MethodImplOptions.Synchronized custom attribute.
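As a brief sketch of the three acquisition styles mentioned above (contention on any of these locks is what drives Contention Rate/Sec up):

using System.Runtime.CompilerServices;
using System.Threading;

class LockStyles
{
    private readonly object _gate = new object();

    // 1. The C# lock statement.
    public void WithLockStatement()
    {
        lock (_gate)
        {
            // critical section
        }
    }

    // 2. Explicit Monitor.Enter / Monitor.Exit.
    public void WithMonitor()
    {
        Monitor.Enter(_gate);
        try
        {
            // critical section
        }
        finally
        {
            Monitor.Exit(_gate);
        }
    }

    // 3. MethodImplOptions.Synchronized locks on "this" for instance methods.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void WithSynchronizedAttribute()
    {
        // critical section
    }
}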

 

# of Current Logical Threads:

This counter displays the number of current .NET thread objects in the application. A .NET thread object is created either by new System.Threading.Thread or when an unmanaged thread enters the managed environment. This counter maintains the count of both running and stopped threads. This counter is not an average over time; it just displays the last observed value.

 

# of Current Physical Threads:

This counter displays the number of native OS threads created and owned by the CLR to act as underlying threads for .NET thread objects. This counter’s value does not include the threads used by the CLR in its internal operations; it is a subset of the threads in the OS process.

 

 

# of Current Recognized Threads:

This counter displays the number of threads currently recognized by the CLR; they have a corresponding .NET thread object associated with them. These threads are not created by the CLR; they are created outside the CLR but have since run inside the CLR at least once. Only unique threads are tracked; threads with the same thread ID re-entering the CLR or re-created after thread exit are not counted twice.
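If you want to sample these threading counters programmatically, here is a hedged sketch using the .NET CLR LocksAndThreads category; the counter names below are as I recall them, so verify the exact spelling in Performance Monitor on your system:

using System;
using System.Diagnostics;
using System.Threading;

class ClrThreadCounters
{
    static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;

        var logical    = new PerformanceCounter(".NET CLR LocksAndThreads", "# of current logical Threads", instance, true);
        var physical   = new PerformanceCounter(".NET CLR LocksAndThreads", "# of current physical Threads", instance, true);
        var recognized = new PerformanceCounter(".NET CLR LocksAndThreads", "# of current recognized threads", instance, true);
        var contention = new PerformanceCounter(".NET CLR LocksAndThreads", "Contention Rate / sec", instance, true);

        // Rate counters need two samples; the first NextValue() only sets a baseline.
        contention.NextValue();
        Thread.Sleep(1000);

        Console.WriteLine("Logical threads     : " + logical.NextValue());
        Console.WriteLine("Physical threads    : " + physical.NextValue());
        Console.WriteLine("Recognized threads  : " + recognized.NextValue());
        Console.WriteLine("Contention rate/sec : " + contention.NextValue());
    }
}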

 

 

 

Performance Test Benefits

Performance test

Benefits:

·         Determines the speed, scalability and stability characteristics of an application, thereby providing an input to making sound business decisions.
·         Focuses on determining if the user of the system will be satisfied with the performance characteristics of the application.
·         Identifies mismatches between performance-related expectations and reality.
·         Supports tuning, capacity planning, and optimization efforts.

Challenges and Areas Not Addressed:

·         May not detect some functional defects that only appear under load.
·         If not carefully designed and validated, may only be indicative of performance characteristics in a very small number of production scenarios.
·         Unless tests are conducted on the production hardware, from the same machines the users will be using, there will always be a degree of uncertainty in the results.

Load test

Benefits:

·         Determines the throughput required to support the anticipated peak production load.
·         Determines the adequacy of a hardware environment.
·         Evaluates the adequacy of a load balancer.
·         Detects concurrency issues.
·         Detects functionality errors under load.
·         Collects data for scalability and capacity-planning purposes.
·         Helps to determine how many users the application can handle before performance is compromised.
·         Helps to determine how much load the hardware can handle before resource utilization limits are exceeded.

Challenges and Areas Not Addressed:

·         Is not designed to primarily focus on speed of response.
·         Results should only be used for comparison with other related load tests.

Stress test

Benefits:

·         Determines if data can be corrupted by overstressing the system.
·         Provides an estimate of how far beyond the target load an application can go before causing failures and errors in addition to slowness.
·         Allows you to establish application-monitoring triggers to warn of impending failures.
·         Ensures that security vulnerabilities are not opened up by stressful conditions.
·         Determines the side effects of common hardware or supporting application failures.
·         Helps to determine what kinds of failures are most valuable to plan for.

Challenges and Areas Not Addressed:

·         Because stress tests are unrealistic by design, some stakeholders may dismiss test results.
·         It is often difficult to know how much stress is worth applying.
·         It is possible to cause application and/or network failures that may result in significant disruption if not isolated to the test environment.

Capacity test

Benefits:

·         Provides information about how workload can be handled to meet business requirements.
·         Provides actual data that capacity planners can use to validate or enhance their models and/or predictions.
·         Enables you to conduct various tests to compare capacity-planning models and/or predictions.
·         Determines the current usage and capacity of the existing system to aid in capacity planning.
·         Provides the usage and capacity trends of the existing system to aid in capacity planning.

Challenges and Areas Not Addressed:

·         Capacity model validation tests are complex to create.
·         Not all aspects of a capacity-planning model can be validated through testing at a time when those aspects would provide the most value.

 

Performance Test Results and Analysis: Client Side

The following section describes the client-side statistics and performance observations.

Hits per second statistics:

Acceptable Criteria: The hits per second graph should follow the same pattern as the running users (user load) graph.

Error statistics:

The error rate observed on the client side is 0%, which is expected for the load test execution.


The following graph represents the response time statistics observed on the client side for Criminal AFIS scenarios.
 
TPS:
Transactions per second (TPS) should be measured from the client environment.

Throughput
Acceptable Criteria: The maximum network bandwidth utilization should not exceed 80% of the total available bandwidth.
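A tiny, hypothetical helper for checking this criterion against observed throughput; the figures below are placeholders, not measurements from the original test:

using System;

class BandwidthCheck
{
    // Returns true if the observed throughput stays within 80% of the link capacity.
    static bool WithinBandwidthSla(double observedMbps, double linkCapacityMbps)
    {
        double utilization = observedMbps / linkCapacityMbps;
        return utilization <= 0.80;
    }

    static void Main()
    {
        // Example: 65 Mbps observed on a 100 Mbps link -> 65% utilization, within the SLA.
        Console.WriteLine(WithinBandwidthSla(65.0, 100.0));
    }
}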

Windows Server Performance Counters Analysis




Performance Testing Counters:

Server Side:

Windows Server:

1.       Available Mbytes

Description:

Available MBytes is the amount of physical memory available to processes running on the computer, in megabytes, rather than bytes as reported in Memory\Available Bytes. It is calculated by adding the amount of space on the Zeroed, Free, and Standby memory lists. Free memory is ready for use; Zeroed memory consists of pages filled with zeros to prevent later processes from seeing data used by a previous process; Standby memory is memory removed from a process' working set (its physical memory) en route to disk, but still available to be recalled. This counter displays the last observed value only; it is not an average.

 

Detailed: It shows the amount of physical memory available during the load test execution.

 

SLA: At least 20% of Physical memory should be available.
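A hedged sketch for checking this SLA with the Memory\Available MBytes counter; the total physical memory figure is assumed to be supplied by the tester (for example from System Information), since this counter does not expose it:

using System;
using System.Diagnostics;

class AvailableMemoryCheck
{
    static void Main()
    {
        // Assumed/illustrative: total installed RAM in MB for the server under test.
        const double totalPhysicalMb = 16384;

        var availableMbytes = new PerformanceCounter("Memory", "Available MBytes");
        double availableMb = availableMbytes.NextValue();

        double availablePercent = (availableMb / totalPhysicalMb) * 100.0;
        Console.WriteLine("Available physical memory: {0:F1}%", availablePercent);
        Console.WriteLine(availablePercent >= 20.0 ? "SLA met" : "SLA violated");
    }
}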

 

2.      Committed Bytes

 

Description:

Committed Bytes is the amount of committed virtual memory, in bytes. Committed memory is the physical memory which has space reserved on the disk paging file(s). There can be one or more paging files on each physical drive. This counter displays the last observed value only; it is not an average.

Detailed: It shows the amount of committed virtual memory used during the load test execution.

SLA: Committed bytes should not exceed 80% of the commit limit.

3.      Page Faults/Sec:

Description:

 

Page Faults/sec is the rate at which page faults are handled by the processor. A page fault occurs when a process refers to a virtual memory page that is not in its working set in main memory. A page fault will not cause the page to be fetched from disk if that page is on the standby list, and hence already in main memory, or if it is in use by another process with which the page is shared.

 

Detailed: A page fault occurs whenever a required page is not found in the process's working set in physical memory.

Hard page faults: the faulted page is not found anywhere in physical memory and must be fetched from disk (the paging file).

 

Soft page faults: the faulted page is found elsewhere in physical memory, so no disk access is needed.

Hard Faults/sec = Page Reads/sec + Pages Input/sec = 0.005 + 0.006 = 0.011

Soft Faults/sec = Page Faults/sec - Hard Faults/sec = 112.69 - 0.011 = 112.679

 

SLA: The hard page faults/sec rate should always be less than the soft page faults/sec rate.
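A small sketch of the same calculation, using the formula above with the sampled counter values (the numbers are the illustrative values from this section, not general constants):

using System;

class PageFaultBreakdown
{
    static void Main()
    {
        // Sample values observed during the test (illustrative).
        double pageReadsPerSec  = 0.005;
        double pagesInputPerSec = 0.006;
        double pageFaultsPerSec = 112.69;

        // Hard faults require a disk read; soft faults are resolved in memory.
        double hardFaultsPerSec = pageReadsPerSec + pagesInputPerSec;   // 0.011
        double softFaultsPerSec = pageFaultsPerSec - hardFaultsPerSec;  // 112.679

        Console.WriteLine("Hard faults/sec: {0}", hardFaultsPerSec);
        Console.WriteLine("Soft faults/sec: {0}", softFaultsPerSec);
        Console.WriteLine(hardFaultsPerSec < softFaultsPerSec ? "SLA met" : "SLA violated");
    }
}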

 

4.      Page Reads/Sec:

 

Description:

Page Reads/sec is the rate at which the disk was read to resolve hard page faults. It shows the number of read operations, without regard to the number of pages retrieved in each operation. Hard page faults occur when a process references a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. This counter is a primary indicator of the kinds of faults that cause system-wide delays. It includes read operations to satisfy faults in the file system cache (usually requested by applications) and in non-cached mapped memory files. Compare the value of Memory\Page Reads/sec to the value of Memory\Pages Input/sec to determine the average number of pages read during each operation.

 

Detailed: It shows the number of read operations per second performed to copy pages from disk into physical memory to resolve hard page faults during the test.

 

SLA: N/A

 

5.      Page Writes/Sec

 

Description:

Page Writes/sec is the rate at which pages are written to disk to free up space in physical memory. Pages are written to disk only if they are changed while in physical memory, so they are likely to hold data, not code.  This counter shows write operations, without regard to the number of pages written in each operation.  This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval.

 

Detailed: It shows the number of write operations per second that copy modified pages from physical memory (RAM) back to disk (the paging file) to free up physical memory.

 

SLA: N/A

 

6.      Pages Inputs/Sec

 

             Description:

Pages Input/sec is the rate at which pages are read from disk to resolve hard page faults. Hard page faults occur when a process refers to a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. When a page is faulted, the system tries to read multiple contiguous pages into memory to maximize the benefit of the read operation. Compare the value of Memory\Pages Input/sec to the value of Memory\Page Reads/sec to determine the average number of pages read into memory during each read operation.

 

Detailed: It shows the number of pages read per second from disk into physical memory to resolve hard page faults; a single read operation may bring in multiple pages.

 

SLA: N/A

 

 

7.      Pages Output/Sec

 

             Description:

Pages Output/sec is the rate at which pages are written to disk to free up space in physical memory. Pages are written back to disk only if they are changed in physical memory, so they are likely to hold data, not code. A high rate of pages output might indicate a memory shortage. Windows writes more pages back to disk to free up space when physical memory is in short supply.  This counter shows the number of pages, and can be compared to other counts of pages, without conversion.

Detailed: It shows the number of pages per second written back to disk (the paging file) to free up physical memory; a single write operation may write multiple pages.

SLA: N/A

8.      Pages/Sec

 

             Description:

Pages/sec is the number of pages read from the disk or written to the disk to resolve memory references to pages that were not in memory at the time of the reference. This is the sum of Pages Input/sec and Pages Output/sec. This counter includes paging traffic on behalf of the system cache to access file data for applications. This value also includes the pages to/from non-cached mapped memory files. This is the primary counter to observe if you are concerned about excessive memory pressure (that is, thrashing) and the excessive paging that may result.

 

SLA: N/A

 

9.       Pool Paged Bytes

Description:

Pool Paged Bytes is the size, in bytes, of the paged pool, an area of system memory (physical memory used by the operating system) for objects that can be written to disk when they are not being used.  Memory\Pool Paged Bytes is calculated differently than Process\Pool Paged Bytes, so it might not equal Process\Pool Paged Bytes\_Total. This counter displays the last observed value only; it is not an average.

SLA: N/A

 

10.  Pool Nonpaged Bytes

 

Description:

Pool Nonpaged Bytes is the number of bytes in the Nonpaged Pool, a system memory area where space is acquired by operating system components as they accomplish their appointed tasks.  Nonpaged Pool pages cannot be paged out to the paging files, but instead remain in main memory as long as they are allocated.

 

SLA: N/A

 

Disk:

11.  Avg Disk read queue length:

 

Description: It shows the number of requests that are queued at the hard disk while doing the read operations.

 

SLA: The disk read queue length should always be less than 2.

 

Processor:

12.  %Processor Time

 

Description:

% Processor Time is the percentage of time that the processor is executing a non-Idle thread.  This counter was designed as a primary indicator of processor activity.  It is calculated by measuring the time that the processor spends executing the thread of the idle process in each sample interval, and subtracting that value from 100%.  (Each processor has an idle thread which consumes cycles when no other threads are ready to run). It can be viewed as the percentage of the sample interval spent doing useful work.  This counter displays the average percentage of busy time observed during the sample interval.  It is calculated by monitoring the time the service was inactive, and then subtracting that value from 100%.

Detailed: It shows the percentage of CPU used to run all the processes on the system (system level, user level, network level, local users, etc.).

SLA: % Processor Time should be less than or equal to 80% of CPU usage.
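A hedged sketch of sampling this counter; note that for rate-style counters the first NextValue() call only establishes a baseline, so a short delay is needed before a meaningful reading:

using System;
using System.Diagnostics;
using System.Threading;

class CpuCheck
{
    static void Main()
    {
        var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total");

        cpu.NextValue();          // first call returns 0; it only sets the baseline
        Thread.Sleep(1000);       // wait one sample interval
        float busyPercent = cpu.NextValue();

        Console.WriteLine("% Processor Time: {0:F1}", busyPercent);
        Console.WriteLine(busyPercent <= 80.0f ? "SLA met" : "SLA violated");
    }
}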

13.  %Idle Time

Description:

% Idle Time is the percentage of time the processor is idle during the sample interval.

Detailed: It shows the % of CPU that is free or idle during the test execution.

SLA: At least 20% of the CPU should be free or Idle during the load test execution (Cumulative of all the Processors)

14.  % Privileged Time

        Description:

% Privileged Time is the percentage of elapsed time that the process threads spent executing code in privileged mode. When a Windows system service is called, the service will often run in privileged mode to gain access to system-private data. Such data is protected from access by threads executing in user mode. Calls to the system can be explicit or implicit, such as page faults or interrupts. Unlike some early operating systems, Windows uses process boundaries for subsystem protection in addition to the traditional protection of user and privileged modes. Some work done by Windows on behalf of the application might appear in other subsystem processes in addition to the privileged time in the process.

Detailed: It shows the percentage of CPU time used by threads running in privileged (kernel) mode, i.e., all processes executing system-mode work.

SLA: Always % Privileged time should be <=40% (Cumulative of all the processes)

15.  % User Time

 

Description:

% User Time is the percentage of elapsed time the processor spends in the user mode. User mode is a restricted processing mode designed for applications, environment subsystems, and integral subsystems.  The alternative, privileged mode is designed for operating system components and allows direct access to hardware and all memory.  The operating system switches application threads to privileged mode to access operating system services. This counter displays the average busy time as a percentage of the sample time.

Detailed: It shows the % of CPU utilized to run all the user-level processes.

 

SLA: %User time should be <=40%

 

System

16.  Processor Queue Length

 

Description:                                                                

Processor Queue Length is the number of threads in the processor queue. Unlike the disk counters, this counter shows ready threads only, not threads that are running. There is a single queue for processor time even on computers with multiple processors. Therefore, if a computer has multiple processors, you need to divide this value by the number of processors servicing the workload. A sustained processor queue of less than 10 threads per processor is normally acceptable, depending on the workload.

 

Detailed: It shows the number of ready threads waiting in the processor queue before they are scheduled to run.

 

SLA: The Processor Queue length should be less than or equal to 10 for each processor
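A hedged sketch that reads System\Processor Queue Length and normalizes it by the number of processors, as the description above recommends:

using System;
using System.Diagnostics;

class ProcessorQueueCheck
{
    static void Main()
    {
        var queueLength = new PerformanceCounter("System", "Processor Queue Length");

        // There is a single ready queue regardless of processor count,
        // so divide by the number of logical processors before applying the SLA.
        float total = queueLength.NextValue();
        double perProcessor = total / Environment.ProcessorCount;

        Console.WriteLine("Queue length per processor: {0:F2}", perProcessor);
        Console.WriteLine(perProcessor <= 10.0 ? "SLA met" : "SLA violated");
    }
}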

TCP

17.  Connection Failures

Description:                                                                                                    

Connection Failures is the number of times TCP connections have made a direct transition to the CLOSED state from the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state.

SLA: N/A           

18.   Connections Active

Description:                                                                                                                            

Connections Active is the number of times TCP connections have made a direct transition to the SYN-SENT state from the CLOSED state. In other words, it shows the number of connections initiated by the local computer. The value is a cumulative total.

SLA: N/A

19.  Connections Established

Description:

Connections Established is the number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT.

 

SLA: N/A
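Finally, a hedged sketch of sampling these TCP counters; the category name is assumed to be "TCPv4" on recent Windows versions (older systems expose a plain "TCP" category, so verify in Performance Monitor):

using System;
using System.Diagnostics;

class TcpCounterSample
{
    static void Main()
    {
        // Category name assumed to be "TCPv4"; use "TCP" on older Windows versions.
        var established = new PerformanceCounter("TCPv4", "Connections Established");
        var active      = new PerformanceCounter("TCPv4", "Connections Active");
        var failures    = new PerformanceCounter("TCPv4", "Connection Failures");

        Console.WriteLine("Connections Established: " + established.NextValue());
        Console.WriteLine("Connections Active     : " + active.NextValue());
        Console.WriteLine("Connection Failures    : " + failures.NextValue());
    }
}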