Performance Counters
Performance Counters are used to provide information as to how well the operating system or an application, service, or driver is performing. The counter data can help determine system bottlenecks and fine-tune system and application performance. The operating system, network, and devices provide counter data that an application can consume to provide users with a graphical view of how well the system is performing. Applications can also use counter data to determine how much system resources to consume.
Below mentioned are the few performance counters and their description for better understanding.
Network Counters
The network performance counters are not typically installed. The Network Segment object that is referred to here is installed when the Network Monitor Agent is installed. The network interface is installed when the SNMP service is installed. Many of the counters have to do with TCP/IP components, such as the SNMP service which relies on TCP/IP.
- Network Interface : Bytes Sent/sec. This is how many bytes of data are sent to the NIC. This is a raw measure of throughput for the network interface. We are really measuring the information sent to the interface which is the lowest point we can measure. If you have multiple NIC, you will see multiple instances of this particular counter.
- Network Interface: Bytes Received/sec. This, of course, is how many bytes you get from the NIC. This is a measure of the inbound traffic In measuring the bytes, NT isn’t too particular at this level. So, no matter what the byte is, it is counted. This will include the framing bytes as opposed to just the data.
- Network Interface : Bytes Total/sec. This is simply a combination of the other two counters. This will tell you overall how much information is going in and out of the interface. Typically, you can use this to get a general feel, but will want to look at the Bytes Sent/sec and the Bytes Received/sec for a more exact detail of the type of traffic.
- Processor : % DPC Time. Interrupts can be handled later. These are called Deferred Procedure Calls. You will want to keep track of these as well. The combination of this time with the % Interrupt Time will give you a strong idea of how much of the precious processor time is going to servicing the network.
- Processor : DPCs queued/sec. This will give you the rate at which DPC are being sent to the process queue. Unlike the Processor Queue Length and the Disk Queue Length, this value only shows you the rate at which the DPCs are being added to the queue, no how many are in the queue. Still, observing this value can give you an indication of a growing problem.
- Network Segment : %Broadcasts. This value will let you know how much of the network bandwidth is dedicated to broadcast traffic. Broadcasts are network packets that have been designated as intended for all machines on the segment. Often, it is this type of traffic that has a detrimental affect on the network.
- Network Segment : %Multicasts. This is a measure of the % of the network bandwidth that multicast traffic is taking. Multicast traffic is similar to the broadcast, however, there is a limited number of intended recipients. The idea was that if you can identify multiple recipients you can reduce the repetitive transmission of data. This type of transfer is used most commonly with video conferencing.
- TCP : Segments Sent/sec. This is the rate at which TCP segments are sent. This is how much information that is being sent out for TCP/IP transmissions.
- TCP : Segments Received/sec. Of course, the rate at which segments are received for the protocol.
- TCP : Segments/sec. This is just the total of the previous two counters. This is the information being sent and received. This is a general indication of how busy the TCP/IP traffic is. The segment size is variable and thus, this does not translate easily to bytes.
- TCP : Segments Retransmitted/sec. This is the rate at which retransmissions occur. Retransmissions are measured based on bytes in the data that are recognized as being transmitted before. On a Ethernet/TCP/IP network retransmissions are a fact of life. However, excessive retransmissions indicate a distinct reduction in bandwidth.
- TCP : Connection Failures. This is the raw number of TCP connections that have failed since the server was started. A failure usually indicates a loss of data somewhere in the process. Data lose can occur at many locations. This could be an indication of another device being down, or problems with the client-side configuration of the software.
- TCP : Connections Reset. This is typically a result of a timeout as opposed to an erroneous set of information. The reset results from the a lack of any information over a period of time.
- TCP : Connections Established. This counter represents them number of connections. Unlike the other two this is more and instantaneous counter of how many TCP connections are currently on the system as opposed to a count of the number of successful connections.
Disk Counters
The Disk Performance counters help you to evaluate the performance of the disk subsystem. The disk subsystem is more than the disk itself. It will include to disk controller card, the I/O bus of the system, and the disk. When measuring disk performance it is usually better to have a good baseline for performance than simply to try and evaluate the disk performance on a case by case basis. There are two objects for the disk—Physical Disk and Logical Disk. The counters for the two are identical. However, in some cases they may lead to slightly different conclusions. The Physical Disk object is used for the analysis of the overall disk, despite the partitions that may be on the disk. When evaluating overall disk performance this would be the one to select. The Logical Disk object analyzes information for a single partition. Thus the values will be isolated to activity that is particularly occurring on a single partition and not necessarily representative of the entire load that the disk is burdened with. The Logical Disk object is useful primarily when looking at the affects or a particular application, like SQL Server, on the disk performance. Again the Physical Disk is primarily for looking at the performance of the entire disk subsystem. In the list that follows, the favored object is indicated with the counter. When the Logical Disk and Physical Disk objects are especially different, the counter will be listed twice and the difference specifically mentioned.
- Physical Disk: Current Disk Queue Length. This counter provides a primary measure of disk congestion. Just as the processor queue was an indication of waiting threads, the disk queue is an indication of the number of transactions that are waiting to be processed. Recall that the queue is an important measure for services that operate on a transaction basis. Just like the line at the supermarket, the queue will be representative of not only the number of transactions, but also the length and frequency of each transaction.
- Physical Disk : % Disk Time. Much like % Processor time, this counter is a general mark of how busy the disk is. You will see many similarities between the disk and processor since they are both transaction-based services. This counter indicates a disk problem, but must be observed in conjunction with the Current Disk Queue Length counter to be truly informative. Recall also that the disk could be a bottleneck prior to the % Disk Time reaching 100%.
- Physical Disk : Avg. Disk Queue Length. This counter is actually strongly related to the %Disk Time counter. This counter converts the %Disk Time to a decimal value and displays it. This counter will be needed in times when the disk configuration employs multiple controllers for multiple physical disks. In these cases, the overall performance of the disk I/O system, which consists of two controllers, could exceed that of an individual disk. Thus, if you were looking at the %Disk Time counter, you would only see a value of 100%, which wouldn’t represent the total potential of the entire system, but only that it had reached the potential of a single disk on a single controller. The real value may be 120% which the Avg. Disk Queue Length counter would display as 1.2.
- Physical Disk : Disk Reads/sec. This counter is used to compare to the Memory: Page Inputs/sec counter. You need to compare the two counters to determine how much of the Disk Reads are actually attributed to satisfying page faults.
- Logical Disk : Disk Reads/sec. When observing an individual application (rather a partition) this counter will be an indication of how often the applications on the partition are reading from the disk. This will provide you with a more exact measure of the contribution of the various processes on the partition that are affecting the disk.
- Physical Disk : Disk Reads Bytes/sec. Primarily, you’ll use this counter to describe the performance of disk throughput for the disk subsystem. Remember that you are generally measuring the capability of the entire disk hardware subsystem to respond to requests for information.
- Logical Disk : Disk Reads Bytes/sec. For the partition, this will be an indication of the rate that data is being transferred. This will be an indication of what type of activity the partition is experiencing. A smaller value will indicate more random reads of smaller sections.
- Physical Disk : Avg. Disk Bytes/Read. This counter is used primarily to let you know the average number of bytes transferred per read of the disk system. This helps distinguish between random reads of the disk and the more efficient sequential file reads. A smaller value generally indicates random reads. The value for this counter can also be an indicator of file fragmentation.
- Physical Disk : Avg. Disk sec/Read. The value for this counter is generally the number of seconds it takes to do each read. On less-complex disk subsystems involving controllers that do not have intelligent management of the I/O, this value is a multiple of the disk’s rotation per minute. This does not negate the rule that the entire system is being observed. The rotational speed of the hard drive will be the predominant factor in the value with the delays imposed by the controller card and support bus system.
- Physical Disk: Disk Reads/sec. The value for this counter is the number of reads that the disk was able to accomplish per second. Changes in this value indicate the amount of random access to the disk. The disk is a mechanical device that is capable of only so much activity. When files are closer together, the disk is permitted to get to the files quicker than if the files are spread throughout the disk. In addition, disk fragmentation can contribute to an increased value here.
Memory Counters
The following performance counters all have to do with the management of memory issues. In addition, there will counters that assist in the determination of whether the problem you are having is really a memory issue.
- Memory : Page Faults/sec. This counter gives a general idea of how many times information being requested is not where the application (and VMM) expects it to be. The information must either be retrieved from another location in memory or from the pagefile. Recall that while a sustained value may indicate trouble here, you should be more concerned with hard page faults that represent actual reads or writes to the disk. Remember that the disk access is much slower than RAM.
- Memory : Pages Input/sec. Use this counter in comparison with the Page Faults/sec counter to determine the percentage of the page faults that are hard page faults.
Thus, Pages Input/sec / Page Faults/sec = % Hard Page Faults. Sustained values surpassing 40% are generally indicative of memory shortages of some kind. While you might know at this point that there is memory shortage of some kind on the system, this is not necessarily an indication that the system is in need of an immediate memory upgrade.
- Memory : Pages Output/sec. As memory becomes more in demand, you can expect to see that the amount of information being removed from memory is increasing. This may even begin to occur prior to the hard page faults becoming a problem. As memory begins to run short, the system will attempt to first start reducing the applications to their minimum working set. This means moving more information out to the pagefiles and disk. Thus, if your system is on the verge of being truly strained for memory you may begin to see this value climb. Often the first pages to be removed from memory are data pages. The code pages experience more repetitive reuse.
- Memory : Pages/sec. This value is often confused with Page Faults/sec. The Pages/sec counter is a combination of Pages Input/sec and Pages Output/sec counters. Recall that Page Faults/sec is a combination of hard page faults and soft page faults. This counter, however, is a general indicator of how often the system is using the hard drive to store or retrieve memory associated data.
- Memory : Page Reads/sec. This counter is probably the best indicator of a memory shortage because it indicates how often the system is reading from disk because of hard page faults. The system is always using the pagefile even if there is enough RAM to support all of the applications. Thus, some number of page reads will always be encountered. However, a sustained value over 5 Page Reads/sec is often a strong indicator of a memory shortage. You must be careful about viewing these counters to understand what they are telling you. This counter again indicates the number of reads from the disk that were done to satisfy page faults. The amount of pages read each time the system went to the disk may indeed vary. This will be a function of the application and the proximity of the data on the hard drive. Irrelevant of these facts, a sustained value of over 5 is still a strong indicator of a memory problem. Remember the importance of “sustained.” System operations often fluctuate, sometimes widely. So, just because the system has a Page Reads/sec of 24 for a couple of seconds does not mean you have a memory shortage.
- Memory : Page Writes/sec. Much like the Page Reads/sec, this counter indicates how many times the disk was written to in an effort to clear unused items out of memory. Again, the numbers of pages per read may change. Increasing values in this counter often indicate a building tension in the battle for memory resources.
- Memory : Available Memory. This counter indicates the amount of memory that is left after nonpaged pool allocations, paged pool allocations, process’ working sets, and the file system cache have all taken their piece. In general, NT attempts to keep this value around 4 MB. Should it drop below this for a sustained period, on the order of minutes at a time, there may be a memory shortage. Of course, you must always keep an eye out for those times when you are simply attempting to perform memory intensive tasks or large file transfers.
- Memory : Nonpageable memory pool bytes. This counter provides an indication of how NT has divided up the physical memory resource. An uncontrolled increase in this value would be indicative of a memory leak in a Kernel level service or driver.
- Memory : Pageable memory pool bytes. An uncontrolled increase in this counter, with the corresponding decrease in the available memory, would be indicative of a process taking more memory than it should and not giving it back.
- Memory : Committed Bytes. This counter indicates the total amount of memory that has been committed for the exclusive use of any of the services or processes on Windows NT. Should this value approach the committed limit, you will be facing a memory shortage of unknown cause, but of certain severe consequence.
- Process : Page Faults/sec. This is an indication of the number of page faults that occurred due to requests from this particular process. Excessive page faults from a particular process are an indication usually of bad coding practices. Either the functions and DLLs are not organized correctly, or the data set that the application is using is being called in a less than efficient manner.
- Process : Pool Paged Bytes. This is the amount of memory that the process is using in the pageable memory region. This information can be paged out from physical RAM to the pagefile on the hard drive.
- Process : Pool NonPaged Bytes. This is the amount of memory that the process is using that cannot be moved out to the pagefile and thus will remain in physical RAM. Most processes do not use this, however, some real-time applications may find it necessary to keep some DLLs and functions readily available in order to function at the real-time mode.
- Process : Working Set. This is the current size of the memory area that the process is utilizing for code, threads, and data. The size of the working set will grow and shrink as the VMM can permit. When memory is becoming scarce the working sets of the applications will be trimmed. When memory is plentiful the working sets are allowed to grow. Larger working sets mean more code and data in memory making the overall performance of the applications increase. However, a large working set that does not shrink appropriately is usually an indication of a memory leak.