Financial companies operating trading platforms have extremely high connectivity requirements for their networks; market transactions can be won or lost based on milliseconds and just a few individual data packets. A momentary glitch in the network, such as a “burst” of traffic, congestion or latency, can cause packet loss that delays or loses market transactions. When it comes to high-frequency trading (HFT), this can have serious financial and legal repercussions for all parties involved.
The key to preventing these issues is a network monitoring infrastructure that delivers accurate, real-time data at extremely high granularity, in the millisecond range. This presents its own unique set of challenges.
In an ideal scenario, IT will have monitoring tools in place to detect “bursts” that result in a network device or link receiving more traffic than it can pass on. Detecting these bursts allows IT to route the incoming traffic to other devices or ports to prevent packets from being dropped while they investigate and resolve the underlying issue. Over time, they can identify which applications or tools are causing problems, upgrade or optimize them, and conduct effective capacity planning to improve overall network performance.
But this doesn’t always happen the way it should. Limitations in network monitoring hardware, like network TAPs and packet brokers, are often the first stumbling block. Financial networks are typically on the cutting edge when it comes to speed and have already upgraded network segments (particularly within the data center) to 100Gbps. It is technically challenging to monitor packets at these speeds: at 100Gbps line rate, minimum-size Ethernet frames can arrive every 6.7 nanoseconds! In fact, packet brokers without special hardware assistance will not be able to process packets at these speeds at all. Thus, the first step for financial Network Operations (NetOps) teams is to be sure that their monitoring solutions can keep up with their network speeds.
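The 6.7-nanosecond figure can be checked with simple wire-time arithmetic. The sketch below assumes standard Ethernet framing, where each frame is accompanied on the wire by 8 bytes of preamble and a 12-byte minimum interframe gap; the function names are illustrative and not taken from any monitoring product.

```python
# Wire-time arithmetic for back-to-back Ethernet frames at a given line rate.
# Assumes standard Ethernet overhead: 8 bytes of preamble/start-of-frame
# delimiter plus a 12-byte minimum interframe gap per frame.
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def frame_interval_ns(frame_bytes: int, link_bps: float) -> float:
    """Nanoseconds between the starts of back-to-back frames of this size."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return bits_on_wire / link_bps * 1e9


def max_frames_per_second(frame_bytes: int, link_bps: float) -> float:
    """Upper bound on the frame rate a monitoring device must sustain."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_on_wire


if __name__ == "__main__":
    link = 100e9  # 100Gbps
    for size in (64, 512, 1518):
        print(f"{size:>5}-byte frames: one every "
              f"{frame_interval_ns(size, link):.2f} ns, "
              f"{max_frames_per_second(size, link):,.0f} frames/s")
```

For 64-byte frames this works out to 672 bits per frame, or about 6.72 nanoseconds at 100Gbps, which corresponds to roughly 148.8 million frames per second in the worst case.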
The second major IT issue at play is that measuring packet data at one-second intervals (which is common) obscures very short bursts of traffic that are hard to detect, yet high enough to cause dropped packets and lost transactions. These “microbursts” disappear into the average traffic when measured at one-second resolution, preventing NetOps teams from identifying the culprit. In other industries, this could be overlooked – a bit of lag in a video call is annoying, but not business-critical. But financial companies must meet higher standards. Measuring packet data at the millisecond level can solve this issue by exposing these microbursts so that the NetOps team can investigate.
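To illustrate how a microburst vanishes at one-second resolution, here is a minimal sketch using a synthetic trace. The 10Gbps link speed, the packet sizes, the 90% threshold and all function names are assumptions made for the example, and framing overhead is ignored for simplicity. Timestamps are integer nanoseconds to avoid floating-point binning errors.

```python
from collections import defaultdict


def utilization_by_bin(packets, bin_ns, link_bps):
    """Return {bin_index: fraction of link capacity used in that bin}.

    packets: iterable of (timestamp_ns, size_bytes) tuples.
    """
    bits = defaultdict(int)
    for ts_ns, size_bytes in packets:
        bits[ts_ns // bin_ns] += size_bytes * 8
    capacity_bits = link_bps * bin_ns / 1e9  # bits the link can carry per bin
    return {b: n / capacity_bits for b, n in sorted(bits.items())}


def find_microbursts(packets, link_bps, bin_ns=1_000_000, threshold=0.9):
    """Return the millisecond bins whose utilization meets the threshold."""
    util = utilization_by_bin(packets, bin_ns, link_bps)
    return [b for b, u in util.items() if u >= threshold]


def synthetic_trace():
    """One second of steady 10% load plus a burst inside millisecond 500."""
    # Background: one 1250-byte packet every 10 microseconds.
    background = [(i * 10_000, 1250) for i in range(100_000)]
    # Burst: 900 extra 1250-byte packets squeezed into a single millisecond.
    burst = [(500_000_000 + i * 1_000, 1250) for i in range(900)]
    return background + burst


if __name__ == "__main__":
    link = 10e9  # assumed 10Gbps link
    trace = synthetic_trace()
    per_second = utilization_by_bin(trace, 1_000_000_000, link)
    print(f"1 s view: {per_second[0]:.2%} utilization")
    print(f"1 ms view: saturated bins = {find_microbursts(trace, link)}")
```

In this trace the one-second average sits near 10%, which looks healthy, while the millisecond view shows bin 500 running at full line rate.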
The third issue is getting an accurate analysis of what caused the microburst. Some measurement methods that rely on proxies, such as buffer utilization, can only tell NetOps that a link is spiking, or saturated. They don’t give any information on what caused the traffic spike, forcing NetOps to play a guessing game to find the root of the problem.
More advanced monitoring will take network traffic directly from the wire through passive optical TAPs and measure, in hardware, traffic utilization of user-specified profiles that include IP endpoints, VLAN tags, QoS bits and other Layer 2-Layer 4 parameters. It is possible to do this in real time for up to 1,000 feeds per link regardless of network traffic speed or packet mix. By seeing the profile or combination of profiles that make up the bursts, network engineers can move traffic between ports before it causes packet drops. Such detailed visibility also helps NetOps troubleshoot any high utilization events which could cause packet loss, identify problematic applications that may need to be optimized or upgraded, and plan network upgrades strategically.
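As a rough software illustration of profile-based measurement (the systems described above do this in hardware at line rate), the sketch below attributes bytes in each millisecond to user-defined profiles. The profile names, the field names (`vlan`, `dscp`, `dst_ip`) and the sample values are hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical profiles: each maps header fields to the values they must match.
PROFILES = {
    "market-data": {"vlan": 100},
    "order-entry": {"vlan": 200, "dscp": 46},
    "backup": {"dst_ip": "10.0.9.20"},
}


def matches(packet, profile):
    """True if every field named in the profile has the required value."""
    return all(packet.get(field) == value for field, value in profile.items())


def bytes_per_profile(packets, profiles, bin_ns=1_000_000):
    """Return {bin_index: Counter({profile_name: bytes seen in that bin})}.

    A packet that matches several profiles is counted under each of them.
    """
    result = {}
    for pkt in packets:
        counts = result.setdefault(pkt["ts_ns"] // bin_ns, Counter())
        for name, profile in profiles.items():
            if matches(pkt, profile):
                counts[name] += pkt["size"]
    return result


def top_contributor(counts):
    """Return (profile_name, bytes) for the largest contributor, or None."""
    return counts.most_common(1)[0] if counts else None


if __name__ == "__main__":
    sample = [
        {"ts_ns": 100, "size": 1000, "vlan": 100},
        {"ts_ns": 200, "size": 500, "vlan": 200, "dscp": 46},
        {"ts_ns": 1_500_000, "size": 9000, "vlan": 300, "dst_ip": "10.0.9.20"},
        {"ts_ns": 1_600_000, "size": 200, "vlan": 100},
    ]
    for bin_index, counts in sorted(bytes_per_profile(sample, PROFILES).items()):
        print(bin_index, top_contributor(counts))
```

Pairing this per-profile breakdown with millisecond utilization shows not only that a link spiked, but which traffic made up the spike.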
The high-pressure environment of financial networks requires IT and NetOps teams to be able to detect unexplained packet drops and troubleshoot network issues on demand, in real time and at line-rate speeds up to 100Gbps. Given the risk posed by dropped packets and lost transactions, millisecond-level monitoring with detailed analysis of bursts and microbursts is essential for financial companies to maintain a positive customer experience and maximize return on investment.
Pete Sevcik is the principal hardware architect at cPacket Networks.