OpenNMS Helps Keep Tabs On Networks
Introducing OpenNMS
OpenNMS helps network administrators track machine uptime, outage and usage information so they can keep their operations healthy. In the late '90s, one of the big buzzword phrases was "service level agreement" (SLA). Users entered into contracts under which network service providers were bound to supply a certain level of computing availability. IT departments and vendors were required to adhere to the SLA or risk monetary penalties. Network management software was born to monitor network resources, track uptime and automatically alert someone when a resource went down. The idea was to automate many of the mundane monitoring functions and free up the network administrators for more important jobs...like fixing network problems. Service level agreements were a definite impetus for network management software.
Today, there are several open source and commercial network management software packages available, including Big Brother, Nagios and HP's OpenView. OpenNMS is an open source package that has been successful not only as a tool to help network managers run their networks, but also as an example of how open source software can be leveraged to create a service business. The latter will be the focus of this article.
Modest Beginnings
OpenNMS was started in 2000 (Project #4141 on SourceForge) and was maintained by Oculan, Inc. Tarus Balog joined Oculan in September of 2001 to build a services business around the application. OpenNMS version 1.0 was released in the spring of 2002 with four paying support customers. According to Balog, OpenNMS has three main functional areas:
Managing these tasks and tracking the activity of network resources has proven to be a complex job, even for an automated system.
The Balancing Act
Early on, Balog was intrigued by the notion of five-nines availability. Many of the service level agreements that he'd seen insisted that system availability be held to 99.999% uptime, or around 30 seconds of downtime per month. This seemed strange to him, since the widely used commercial HP OpenView tool only polled machines every five minutes. With that polling interval, the shortest outage the tool can record is five minutes long: much more than 30 seconds, and more even than the roughly 4.5 minutes allowed by 99.99% uptime.
Polling time also became a big issue as the number of machines increased. "Managing ten to a hundred machines is easy," remarked Balog. It's more challenging as you increase the number of machines. The OpenNMS solution uses a "down-time model" that temporarily shortens the polling interval when an outage is detected (it changes to 30 seconds by default), so a down service is checked more often until it recovers. This model allows customers to strike a reasonable balance between performance and capability. OpenNMS can currently monitor up to 20,000 devices from a single instance. Since the software is open source, capable coders can add more instances if needed.
Data collection tends to throttle performance and capability. One of Balog's original four customers, Rackspace, couldn't collect data fast enough. To mitigate the problem, OpenNMS was modified to collect 200,000 data points from approximately 24,000 interfaces every five minutes, or 2.4 million data points an hour from a single instance of OpenNMS. The limitation turned out to be the speed at which the disk controller could write the data, not OpenNMS itself.
Making almost every aspect of OpenNMS configurable allows customers to tweak their own systems easily. Flexible configurations also let Balog provide personalized services to customers who want an optimized turnkey solution for their network monitoring and notification systems.
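The availability figures and the down-time model described in this section can be checked with a few lines of Python. This is only an illustrative sketch, not OpenNMS code: the function names are hypothetical, a 30-day month is assumed, and the 300-second normal polling interval is an assumption borrowed from the five-minute figure quoted for OpenView. Only the 30-second outage interval and the data collection numbers come from the article.

```python
# Illustrative sketch only. The names below are hypothetical and are not
# part of OpenNMS or any other product mentioned in the article.

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def downtime_budget_seconds(availability_percent: float) -> float:
    """Downtime allowed per 30-day month for a given availability target."""
    return SECONDS_PER_MONTH * (1 - availability_percent / 100)


def next_poll_interval(service_is_down: bool,
                       normal_interval: int = 300,   # assumed: five minutes
                       outage_interval: int = 30) -> int:
    """Poll more often while an outage is in progress, less often otherwise."""
    return outage_interval if service_is_down else normal_interval


def points_per_hour(points_per_cycle: int, cycle_minutes: int) -> int:
    """Scale a per-cycle data collection count up to one hour."""
    return points_per_cycle * 60 // cycle_minutes


if __name__ == "__main__":
    for target in (99.99, 99.999):
        budget = downtime_budget_seconds(target)
        print(f"{target}% uptime allows {budget:.1f} seconds of downtime per month")
    print("interval while up:", next_poll_interval(False), "seconds")
    print("interval while down:", next_poll_interval(True), "seconds")
    print("data points per hour:", points_per_hour(200_000, 5))
```

Under these assumptions, five nines leaves a budget of about 26 seconds a month, which is why a fixed five-minute poll cannot verify such an agreement, while a 30-second poll during an outage at least brings the measurement close to the size of the budget.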
Open Source Service
Providing a competitive and viable open source alternative for network management is only part of Balog's effort to keep OpenNMS going. Quality software for customers requires quality programmers to do the coding. Balog's solution to getting good people was to create the "Order of the Green Polo" (OGP) as a way to distinguish OpenNMS programmers from everyone else. Programmers with significant voluntary contributions to OpenNMS are inducted into the group and receive that special, exclusive polo shirt. It's kind of like the elusive "Green Jacket" in golf. His "Order of the Green Polo" idea will soon pay off. Balog is planning to hire five new programmers over the next year to work on the software. You have three guesses as to where those programmers will come from, and the first two don't count. Perhaps other projects should take note of the technique.
When asked about the green shirts and web page theme, Balog said it really came down to him just liking the color green. Interestingly, in the network management world, green tends to equal good. Other colors, like red, mean bad things are happening. Seems like green is a natural.
Another successful technique Balog has used to leverage open source software is the idea of multi-level service. You can spend a little or spend a lot, depending on how much OpenNMS application support you need. For $3,295, a customer gets the "Getting to Know You" package. This two-day consulting engagement provides a way to get started with or evaluate the OpenNMS product. The package includes no support beyond the two days. Customers who want ongoing help might consider the "No Worries" package. Starting at $995 a month, Balog and his associates will remotely run a customer's network management system for them. This plan lets the client act on outages, stats and system notifications without having to know the OpenNMS application. The system can be administered over a VPN or by using SSH over the network.
Several other plans are available. Details are on the OpenNMS pricing plan page. Readers should be aware that there are actually two separate sites associated with OpenNMS and overseen by Balog. The commercial site represents the consulting and service business for customer network management. The open source project site maintains all development with full transparency on SourceForge. All code is published under the GPL.
Into The Future
Balog is optimistic about the future and OpenNMS. During our interview, his enthusiasm for the job was apparent. He confessed that he could go on and on about Linux and network management systems. With 35 steady customers in Singapore, France, England and the US, and 4,000 downloads of the software a month, he has good reason to be positive.
Rob Reilly is a consultant, writer, and commentator who advises clients on business & technology projects. His articles on Linux, personal branding, and public speaking skills regularly appear in various high-end Linux and business media outlets. Send him a note or visit his Web site at http://home.earthlink.net/~robreilly.