Before we get started, it might be useful to share some introductory details on vCOPS (vCenter Operations Manager). It can be installed as a plug-in in your current vSphere environment. After some minimal configuration it will be up and running. vCOPS works best after it has been in the environment for at least a few weeks. Its usefulness does not necessarily lie in finding outright errors (although it can do that), but in finding anomalies in your environment. It “learns” the environment and can point out what is out of the norm. There are three core scores that are given on the main dashboard as shown in Figure A.
Figure A
VMware calls these core elements badges. There’s the health badge that shows immediate problems, the risk badge that shows future problems, and the efficiency badge that shows opportunities to optimize. There are then subcategories under each of these badges which contribute to the scores.
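To make the idea of "learning" the environment a bit more concrete, here is a minimal sketch of one generic way to flag values that fall outside the norm. This is not how vCOPS computes its badges (the source doesn't say); the function names, the sample data, and the three-standard-deviation rule are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def learn_baseline(history, k=3.0):
    """Return (low, high) bounds of 'normal' learned from past samples.

    Assumption: 'normal' is the mean plus or minus k standard deviations.
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma

def is_anomaly(value, bounds):
    """True if the value falls outside the learned bounds."""
    low, high = bounds
    return value < low or value > high

# Hypothetical CPU usage samples (%) gathered while the tool "learns"
history = [22, 25, 24, 23, 26, 25, 24, 22, 23, 25]
bounds = learn_baseline(history)

print(is_anomaly(24, bounds))  # a typical value
print(is_anomaly(60, bounds))  # far above anything seen so far
```

The point of the sketch is that the threshold comes from the environment's own history rather than a fixed number, which is why the tool gets more useful after a few weeks of data.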
In the live demo, they offered up three scenarios to show the value of the tool. The first showed how to find what’s causing the slow performance of a workload as shown in the steps below.
- In the search field found in the upper-right corner, type in the name of the VM that is slow.
- Under the Alerts pane there is an option to filter by workload. Click on the workload filter.
- Find the workload alert and then click on it.
- From here you can see the symptoms, such as heavy disk I/O.
- Now click on the Operations tab.
- Check out the Workload section and you can see that the datastore “skittle” (icon representing the datastore) is red.
- Click on the datastore skittle and then click Details.
- Click on the Analysis tab and select Storage as a focus area then filter by VM.
- You can see that the color is based on latency, so if your problem is storage latency, it will show up here.
- You can deduce that you either need faster storage or more spindles because the current datastore can’t handle the VM workload.
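The latency-based coloring in the storage steps above can be pictured with a small sketch. The thresholds, VM names, and latency figures below are hypothetical; the source doesn't state what cutoffs vCOPS uses, so treat this only as an illustration of mapping per-VM latency to a color and picking out the worst offender.

```python
def latency_color(latency_ms, warn_ms=15.0, crit_ms=30.0):
    """Map a latency reading to a color. Thresholds are assumed, not vCOPS values."""
    if latency_ms >= crit_ms:
        return "red"
    if latency_ms >= warn_ms:
        return "yellow"
    return "green"

# Hypothetical per-VM datastore latency in milliseconds
vm_latency_ms = {"web01": 4.2, "db01": 41.0, "app01": 18.5}

# Color every VM, then find the one with the highest latency
colors = {vm: latency_color(ms) for vm, ms in vm_latency_ms.items()}
worst = max(vm_latency_ms, key=vm_latency_ms.get)

print(colors)
print(worst)
```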
The second scenario showed how to tell whether a VM is regularly short on memory.
- Search for the problematic VM by name.
- By looking at the dashboard you will be able to see the workload is very high and it’s hitting the memory pretty hard. Although you can see this right in the dashboard, you may want to see if this is a common issue with this machine.
- Click on the Planning tab, then click on the Stress badge.
- Here you’ll be able to see how much of the time memory has been undersized. So if your memory is showing that it’s been undersized for 80% of the time, it may be time to add more memory!
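The "undersized for 80% of the time" figure from the memory steps above is easy to reason about with a quick calculation. This is a rough sketch under stated assumptions: the samples, the 4096 MB configured size, and the rule "demand above configured memory counts as undersized" are all made up for illustration and are not taken from vCOPS.

```python
def undersized_fraction(demand_mb, configured_mb):
    """Fraction of samples where memory demand exceeded the configured size."""
    if not demand_mb:
        return 0.0
    over = sum(1 for d in demand_mb if d > configured_mb)
    return over / len(demand_mb)

# Hypothetical memory demand samples (MB) for a VM configured with 4096 MB
samples = [3900, 4300, 4500, 4200, 3800, 4400, 4600, 4100, 4700, 4250]
fraction = undersized_fraction(samples, configured_mb=4096)

print(f"{fraction:.0%}")  # prints 80%
```

Looking at the share of time rather than a single reading is what separates a one-off spike from a VM that genuinely needs more memory.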
The third scenario showed how to trace a CPU spike back to a change made to the VM.
- Search for the problematic VM.
- Click on the red VM skittle under the Operations tab.
- Click on the host of the VM and you’ll be able to see the CPU is showing as red.
- Click on the CPU.
- Click on the Events tab.
- There is a time window that can be changed if necessary. You’ll want to look for the time where the graph changes from green to red. This is most likely when the change was made to your VM.
- Drill in to where the change is and look at the events list.
- In the events list, you’ll be able to see if something was installed, like an antivirus. An installation, restart, or similar event is most likely why the CPU kept spiking and started showing red in vCOPS. As mentioned above, if you combine this with vCenter Configuration Manager, you can actually find out which user made the change and roll it back in one click!
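The idea behind the last few steps, finding when the graph turned red and then checking which events happened around that time, can be sketched as follows. The timestamps, event names, and the 30-minute window are hypothetical, and this is only a picture of the reasoning, not something vCOPS exposes in this form.

```python
from datetime import datetime, timedelta

def first_red_transition(samples):
    """Return the timestamp where the color first changes from non-red to red.

    samples: list of (timestamp, color) tuples in time order.
    """
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        if prev != "red" and cur == "red":
            return ts
    return None

def events_near(events, when, window=timedelta(minutes=30)):
    """Names of events logged within the window around the given time."""
    return [name for ts, name in events if abs(ts - when) <= window]

# Hypothetical CPU health samples taken every 15 minutes
t0 = datetime(2013, 1, 1, 9, 0)
colors = ["green", "green", "green", "red", "red"]
samples = [(t0 + timedelta(minutes=15 * i), c) for i, c in enumerate(colors)]

# Hypothetical event list for the same VM
events = [
    (datetime(2013, 1, 1, 7, 0), "snapshot removed"),
    (datetime(2013, 1, 1, 9, 40), "antivirus installed"),
]

changed_at = first_red_transition(samples)
suspects = events_near(events, changed_at)
print(changed_at, suspects)
```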