A Programmer’s Perspective on Life, the Universe, and the Network
In Part 1, A Programmer’s Perspective on the Current State of Network Tools, we left off asking what would happen if we developed NPM in the same way we develop movies and video games. In this new world of NPM, we would not even use terms like NPM, or network monitoring and troubleshooting anymore, because that is part of the problem.
Being as technical as we are, and dare I say less creative, we name things for what they do and what they are, like routers, networks, packets, protocols, standards, and RFCs. And instead of taking these low-level terms and giving them higher-level names associated with more interesting and forward-thinking metaphors so that we can write a story, we string a bunch of technical terms together, take the first character of each word, and call it an acronym, like NPM, ART, TCP, and every other term we use. And that’s it, we are done. We then hang our hat on that and say that we coined that term.
In Another World…
You may hate me for this comparison, but you know it’s true. And to be honest, it upsets me, because it did not have to be this way, and in some parallel universe things happened just a little bit differently, and networking and storytelling merged, and a whole generation of network engineers were the superheroes and the celebrities that they deserved to be. But it is not too late, and although we did have to sacrifice a few generations of lonely IT people in dark server rooms hacking away at a CLI to get where we are now, we can dig the NPM tools industry out of this rut, catch up to where it should be now, and launch it into a more interesting and visual future that everyone wants to be a part of.
And maybe we are not that far off. Many NPM solutions provide more modern web-based apps these days, which is a huge step forward. Just by being a web app, they are much more accessible and familiar to most kids right out of college, or even high school.
The Heart of the Story
So back to the college roommates, because people and their stories are the most important ingredients. The technology helps a lot, but even ancient game UI technology like Zelda would be a better basis for NPM analysis and visualization than what we have today. Why? Because the act of network monitoring and troubleshooting is really a story about people, who along the way bump into walls, find clues, open doors, solve problems, get stuck, go back, meet others, and ask questions. These are the kinds of actions that video games are built on: actions that take place in the real world but are rarely built into the workflow of an NPM solution.
Back on Planet Earth…
In the real world, we would call it a mystery, and it would be suspenseful, thrilling, and most important of all it would evoke emotions. And because of all the emotions, we would remember all the details, and that is important because we know that emotions are the reason we remember things. And if you don’t know that, look it up on YouTube.
In a really good story, you are happy, sad, mad, and go in and out of all kinds of other emotional states. Even the transition between emotions is kind of a meta-emotion. Think about it: some of the best and most memorable movie scenes are the ones that make you laugh and cry in less than 2 minutes. That matters a lot, not the crying part, but being able to remember the details, because when it comes to network problem-solving there are a lot of states, and transitions from one state to another, and the more emotion we can create and associate with these, the more complex our internal state machine can become.
The Story of Network Troubleshooting
What does all that mean? It means we will be able to solve harder problems. I am not saying that today’s modern NPM solutions cannot assist with some of this storytelling along the way, or provide a framework for storytelling, or workflow, or as some call it these days, just flow. Yes, the solution can help some, but it won’t tell the story or even lead you down a certain path. Nope, the storytelling part, which I am clearly arguing is the most important component, the glue that gets us from the problem, through the journey, and to the solution, is left as an exercise for the user, or the team.
And we have all seen it happen, and we know that special person or team, who in these rare moments gets caught up in the excitement of the hunt, chasing down a network problem, telling or creating a story along the way, and through creative dialog, a vivid imagination, and possibly even lots of hand waving (if the video camera was on), can propel the whole team, together, through the story, to arrive at a successful conclusion, the end of the story, and hopefully a fix.
If you are the lead performer during this story, then you are like the Dungeon Master, describing the scene, and writing the script as you ask questions, provide facts, roll the dice, and every so often take a leap of faith that may kill you (metaphorically speaking of course), or pivot you in the right direction.
Happily Ever After?
Along the way, you may find and use clues, or other data to correlate and generate further insight. And being the hero in many of these stories, and having saved the day many times before, you have seen it all and done it all, and have available to you the most powerful tool of all, the one we call intuition. It is a leap of faith that you can take to skip past endless steps, like a wormhole, saving time and money for the company, and protecting the security, reliability, and performance of the network.
And like every story, this one has to end. And the sooner the better, since time is money. But who knows, the end of this story may be happy, like a simple QoS change to a router, or a firmware upgrade, or it may be sad, like sabotage or theft. But either way, you told the story, not the software, at least not any software I am familiar with yet.
So that was fun, and someday we will get there. But where is “there”, and is it the right metaphor, the idea of using video games and movies? Will it take us very far, or will we need to think of something else, like the way the brain works, space travel, or quantum theory? Maybe it will get us far enough along for now, and I think so in a way, but it occurs to me that movies and video games, once they are done, are cast in stone, and do not change.
Network Management: The Ultimate Road Trip
Networks on the other hand, once deployed, are very dynamic and alive. Ok maybe not alive, but they are more like the road and highways metaphor, and a place where the movies are made. I mean, how many great movies are about road trips? Like a million, or at least a couple thousand. The point I am trying to make is that we have a long way to go before the NPM software is going to tell the story for you. It is your story, and your ability to use the features of the NPM software will make it better, and hopefully easier, and take less time.
In the meantime, when you choose an NPM solution, look for the best analysis and visualization you can find. The more it is like a video game or movie, the better. LiveNX is more like a video game than any other NPM UI I have seen, and the stories I have seen and heard both SEs and customers tell with it are amazing.
In LiveWire, the new Peermap in the OmniPeek for Web UI is so interactive and innovative that we are still learning new workflows and stories to tell with it. Also in LiveWire, the Flow Visualizer is used to tell the story of a flow, which is oftentimes very sad. I love using it to clearly show how the network ACKs are responding in milliseconds, while the application is taking forever to respond with the actual data, which provides hard evidence to either the programmers or the application vendors, so they can fix the problem.
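To make the "fast ACK, slow application" story concrete, here is a minimal Python sketch of the arithmetic behind it. It is not how the Flow Visualizer works internally; the `FlowEvent` type, the event kinds, and the timestamps are all made up for illustration. The idea is simply that the gap between a request and its ACK belongs to the network, while the gap between the ACK and the first byte of real data belongs to the application.

```python
from dataclasses import dataclass

@dataclass
class FlowEvent:
    time_ms: float  # capture timestamp in milliseconds (hypothetical data)
    kind: str       # "request", "ack", or "data"

def split_latency(events):
    """Return (network_ms, application_ms) for one request/response exchange."""
    first_seen = {}
    for event in sorted(events, key=lambda e: e.time_ms):
        # Keep only the first timestamp for each kind of event.
        first_seen.setdefault(event.kind, event.time_ms)
    network_ms = first_seen["ack"] - first_seen["request"]
    application_ms = first_seen["data"] - first_seen["ack"]
    return network_ms, application_ms

# Made-up exchange: the ACK comes back in 2 ms, the data almost 2 seconds later.
events = [
    FlowEvent(0.0, "request"),
    FlowEvent(2.0, "ack"),
    FlowEvent(1850.0, "data"),
]
network_ms, application_ms = split_latency(events)
print(f"network ACK: {network_ms:.1f} ms, application response: {application_ms:.1f} ms")
```

With numbers like these, the network is clearly not the villain of the story, and that is the kind of evidence you can hand to the application team.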
In the analysis, look for alerts that can be used in both the monitoring and troubleshooting phases (chapters?) of the story. Lots of alerts are good, but the quality and depth of the alerts have to be looked at as well. For example, there are single-packet alerts like “QoS is missing”. Useful, but not much depth. On the other hand, an alert like “QoS has changed” has to maintain state between packets, and says that the behavior of the network is now different. Also look for a solution that can integrate alerts from other sources, and correlate them.
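The difference between those two kinds of alerts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any product's implementation: the `Packet` type and the flow key are hypothetical, and "QoS is missing" is assumed here to mean a DSCP value of 0.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    flow: tuple  # hypothetical flow key, e.g. (src, dst, dst_port)
    dscp: int    # DSCP value from the IP header; 0 is assumed to mean unmarked

def qos_missing(packet):
    """Single-packet alert: needs nothing but the packet in hand."""
    return packet.dscp == 0

class QosChangeDetector:
    """Stateful alert: remembers the last DSCP value seen for each flow."""

    def __init__(self):
        self.last_dscp = {}

    def check(self, packet):
        previous = self.last_dscp.get(packet.flow)
        self.last_dscp[packet.flow] = packet.dscp
        if previous is not None and previous != packet.dscp:
            return f"QoS has changed on {packet.flow}: DSCP {previous} -> {packet.dscp}"
        return None

# A made-up voice flow that loses its marking on the third packet.
flow = ("10.0.0.5", "10.0.1.9", 5060)
detector = QosChangeDetector()
alerts = [detector.check(Packet(flow, dscp)) for dscp in (46, 46, 0)]
print(alerts[-1])
```

The first check can run on any packet in isolation. The second only fires because the detector kept a memory of the flow, which is what lets it say that the network is behaving differently than it did a moment ago.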
Visualizing the Network
On the visualization side, look for ways to correlate these alerts with other types of data. It is this feature that gives the storyteller insight and the ability to make those leaps of faith. In some products, the monitoring and troubleshooting components are completely different products. This is fine and common, but make sure they are tightly integrated, so the flow from the monitoring phase to the troubleshooting phase can be seamlessly woven together. This is another reason why having a web-based UI is a huge plus, allowing the different components of a solution to integrate and flow together better than if they are completely different non-web-based applications.
Speaking of integration, let’s talk about visualization integration. Your NPM will have all kinds of different views, and the more the merrier. There will surely be at least one peermap, and maybe a number of them. A peermap is a visualization that displays the nodes and the different ways in which they are related to each other.
The nodes are usually either IP addresses, physical addresses, ESSIDs, or some other field in the packet (the “nodes” can also be presented as a higher-level construct like a flow, subnet, country, etc.). The relationships between the nodes are usually single lines connecting the nodes, and they can represent anything from utilization to latency, to QoS, etc. There can also be modes where there are multiple lines connecting the nodes. These lines can represent the applications or protocols flowing between the nodes, channels, data rates, codecs, etc.
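Underneath the picture, a peermap is just a graph. Here is a minimal Python sketch of that idea, assuming made-up flow records of the form (source, destination, protocol, bytes). The `build_peermap` function is hypothetical and not taken from any product. Each pair of nodes gets one relationship, and the per-protocol byte counts on that relationship are what a "multiple lines" mode could draw.

```python
from collections import defaultdict

def build_peermap(flows):
    """Aggregate (src, dst, protocol, bytes) records into nodes and edges."""
    nodes = set()
    edges = defaultdict(lambda: defaultdict(int))  # (a, b) -> protocol -> bytes
    for src, dst, protocol, byte_count in flows:
        nodes.update((src, dst))
        pair = tuple(sorted((src, dst)))  # treat the relationship as undirected
        edges[pair][protocol] += byte_count
    return nodes, edges

# Made-up flow records for illustration.
flows = [
    ("10.0.0.5", "10.0.1.9", "SIP", 1200),
    ("10.0.1.9", "10.0.0.5", "RTP", 48000),
    ("10.0.0.5", "10.0.2.20", "HTTPS", 9000),
    ("10.0.0.5", "10.0.1.9", "RTP", 52000),
]
nodes, edges = build_peermap(flows)
for (a, b), protocols in sorted(edges.items()):
    total = sum(protocols.values())  # one thick line: total utilization
    print(f"{a} <-> {b}: {total} bytes across {sorted(protocols)}")
```

Swapping the node key from an IP address to a subnet or a country is a one-line change, which is why the same view can tell the story at several altitudes.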
Some peermaps may be static. Boring! But still, better than nothing. More advanced peermaps will be interactive, allowing you to move the whole graph, zoom in and out, move the nodes around, and search in many ways. Most of these graphs are 2D though. I am still looking forward to a 3D graph that has depth. Maybe the depth represents time, and the nodes with less recent activity are farther back? All kinds of possibilities there. But most important is the integration that the peermap, and other high-level visualizations, have with other views. This integration allows it to be part of the story, a means to an end, and not just the end.
Choosing an NPM
And since we are talking about an NPM, which is a network performance monitor, it better have at least one packet view, if not multiple. If it does not, you should keep looking. And if it has a packet view, it has to capture and analyze packets. Some NPM solutions rely on Wireshark, while others have the packet view built-in. Ideally, the higher-level analysis and visualizations can be used to monitor, troubleshoot, and solve problems without having to look at the packets. But that is not always possible, so not only does the NPM need to include a packet view, it better be amazing, and again tightly integrated into all of the other views. Amazing is tough for a packet view, but it should at least list the packets, display the decode for each packet, and have a hex/ASCII view.
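For anyone who has not stared at one, a hex/ASCII view is the least glamorous part of the story and also the simplest. A minimal Python sketch follows, with a made-up payload standing in for captured bytes. The `hex_ascii_view` function is hypothetical, written only to show the layout: offset, hex bytes, then printable characters with dots for everything else.

```python
def hex_ascii_view(data, width=16):
    """Render bytes as lines of: offset, hex bytes, printable ASCII."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        ascii_part = "".join(
            chr(byte) if 32 <= byte < 127 else "." for byte in chunk
        )
        lines.append(f"{offset:04x}  {hex_part:<{hex_width}}  {ascii_part}")
    return lines

# Made-up payload for illustration, not a real capture.
sample = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
for line in hex_ascii_view(sample):
    print(line)
```

The decode pane is the hard part of a packet view. This pane is the one you fall back on when the decode and your intuition disagree.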
NPM solutions typically include probes that are deployed throughout the network to capture and analyze the network traffic. These probes can be hardware or software probes, and the bigger the network, the more probes you will need. In really large organizations, there can be large probes in the data centers, and many smaller probes at the edge networks.
At some scale, it becomes necessary for the NPM solution to provide a means to manage all of the probes from a single pane of glass, and ideally, that place is in the cloud. The LiveAction Device Management Server (DMS) is a great example: the first fully SaaS-based solution for managing all of the capture and analysis probes on the corporate network.
Find the Right Tool for Your Business
But most of all, when choosing an NPM solution, choose the right company with the right people for you and your team, and people you have access to before and after you purchase the solution. And who are the right people? Well, if you made it this far, you know the right people are the ones that tell the best stories. And in an NPM solutions company, these are the SEs.
During a POC you will be working with one or more of them, and in my experience, these are the superheroes who are the best storytellers and problem solvers, and will not only make the solution look and perform the best during the POC but can also help you be the most successful using the solution to create and tell the stories that will make you a hero throughout your own journey.
This all may sound daunting, and it can be. But the life of a superhero is never easy, and neither is the process of choosing the right NPM because there are so many factors to consider, and even villains, devil’s advocates, and naysayers along the way to contend with.
Trust the Process
The POCs for choosing an NPM solution can take months and even years. Just that part can cost you some money. In my opinion, it is best to put some skin in the game, and buy the pilot. The vendor is going to give you more attention and respect than if you drag them along on a giant POC, not knowing if you are going to buy something or not. But most importantly, when you do make a choice and pick a solution, buy the training! This is possibly the most crucial piece of advice I can give you. Buy training time, buy whatever extended support they have, use support, and be nice.
Like I have been saying, the solution does not tell the story, people do, and SEs are really good at it. So during all those Zoom and Webex meetings, get to know your SEs, listen to their stories and record them. Of course, the products matter as well, and people come and go, so it is the right balance that matters. When you find the right balance for you and your organization, that is the company you want to partner with.
By: Chris Bloom, Lead Technical Engineer