For network managers, the VMware vSphere 5.0 release brought great news in the form of enhancements to the Virtual Distributed Switch (vDS). In particular, the vDS now supports Port Mirroring (commonly known as SPAN) in addition to (finally!) formally offering NetFlow support.
This should go a long way toward helping with one of the major issues that network engineers and managers have with virtual server environments: the loss of visibility into traffic passing between virtual machines on the same host. I’ve written on this before, referring to it as “the fog of virtualization.” This blind spot has significantly hindered both understanding performance and troubleshooting degradations. In an EMA research report from earlier this year, network managers called out performance and visibility as critical needs for managing in and around a virtual server infrastructure.
Application-aware network performance management vendors have been trying to deal with this in a variety of ways. Those who support NetFlow have been picking up and offering reports against the experimental NetFlow records coming out of the older versions of VMware’s vSwitches, but the jury is still out on whether that data is accurate enough to rely upon. Packet-based monitoring vendors have offered virtual probes and virtual taps that reside as VM appliances and rely upon promiscuous mode connections to the vSwitch. This is more accurate than the experimental NetFlow, but it does require sysadmins to agree to deploy tools on their hypervisors that can consume significant CPU cycles. One vendor offering a virtual probe appliance even went so far as to recommend that a dedicated CPU core be assigned in order to assure full functionality.
These prior options have been good enough for some people, and many have experimented with them, but they have fallen short of mainstream status in part due to the reasons mentioned above. When I talk to networking pros, many of them tell me they have been pinning their hopes on third-party virtual switch technology, in particular Cisco’s Nexus 1000V, as the best opportunity to solve this issue. The Nexus 1000V offers traditional switch monitoring capabilities such as SPAN and NetFlow. The biggest drawback here has been cost – the 1000V is not bargain priced. Still, at Cisco Live! in Las Vegas two months ago, I was pleased to hear most people saying that they had deployed the Nexus 1000V in production, whereas the year prior no one had moved it out of their test labs.
One other innovative option is offered by Net Optics, with their Phantom Virtual Tap (and just last week the newly announced Phantom HD). Net Optics has taken the approach of tying directly into the hypervisor kernel rather than pulling data out of the vSwitch and into a VM. This has advantages in terms of reducing load on the switching function, but may not be palatable to those concerned about optimizing overall hypervisor/system performance. Still, Net Optics raises a good counterpoint to the vDS SPAN feature, pointing out that it may also represent a significant processing load and loss of bandwidth which in itself may be deemed unacceptable.
On the NetFlow front, the new vDS supports NetFlow version 5 format. That means data from this viewpoint can be integrated into virtually every commercial or open source NetFlow collection/analysis tool. This will be most helpful in providing basic flow monitoring and a quick understanding of the aggregate activity between the VMs on each host. Additionally, NetFlow is commonly used to track activity for security purposes, and this will help with recognizing unusual activity between VMs. But NetFlow v5 does suffer some limitations when used for definitive troubleshooting: its fixed record format describes IPv4 flows only and carries summary counters rather than packet-level detail. It will be interesting to see if VMware continues to push forward towards more current versions of NetFlow, such as v9 and IPFIX, which can include much more detailed flow data through flexible templates.
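To make the "aggregate activity between the VMs" idea concrete, here is a minimal Python sketch of how a collector might decode a NetFlow v5 export datagram and total the bytes per source/destination pair. The field layout follows the commonly documented v5 format (a 24-byte header followed by 48-byte flow records). The names `parse_netflow_v5`, `bytes_per_pair`, and `Flow` are my own illustrative choices, not part of any VMware or vendor tool, and the sketch assumes you have already received the raw UDP payload from whichever collector port you configured.

```python
import socket
import struct
from collections import Counter, namedtuple

# NetFlow v5 layout: 24-byte header, then `count` records of 48 bytes each.
# Header: version, count, sys_uptime, unix_secs, unix_nsecs,
#         flow_sequence, engine_type, engine_id, sampling_interval
HEADER = struct.Struct("!HHIIIIBBH")
# Record: srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets,
#         first, last, srcport, dstport, (pad), tcp_flags, prot, tos,
#         src_as, dst_as, src_mask, dst_mask, (2 bytes pad)
RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")

# Hypothetical record type holding only the fields this sketch uses.
Flow = namedtuple("Flow", "src dst src_port dst_port protocol packets octets")


def parse_netflow_v5(datagram):
    """Decode one NetFlow v5 export datagram into a list of Flow tuples."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram is shorter than a NetFlow v5 header")
    version, count = HEADER.unpack_from(datagram)[:2]
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram (version %d)" % version)
    if len(datagram) < HEADER.size + count * RECORD.size:
        raise ValueError("datagram is truncated")

    flows = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        flows.append(Flow(
            src=socket.inet_ntoa(f[0]),
            dst=socket.inet_ntoa(f[1]),
            src_port=f[9],
            dst_port=f[10],
            protocol=f[12],
            packets=f[5],
            octets=f[6],
        ))
    return flows


def bytes_per_pair(flows):
    """Sum the reported byte counts for each (source, destination) pair."""
    totals = Counter()
    for flow in flows:
        totals[(flow.src, flow.dst)] += flow.octets
    return totals
```

In practice you would feed `parse_netflow_v5` the payload of each UDP packet read from a socket, then call `bytes_per_pair(...).most_common(10)` to see which VM pairs are exchanging the most traffic. Every address here is an IPv4 dotted quad, which reflects the v5 limitation noted above.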
Overall, I’m very excited by the progress being made here. The new vDS features just may represent *the* visibility answer for the broader masses. While many will still look to the Nexus 1000V or virtual taps and probes, the vDS now becomes a standard part of the VMware architecture. Consequently, traditional mainstream monitoring will now be possible out of the box.
Finally, perhaps, the fog of virtualization is lifting…