This is the monitoring page for
ClusterGate.RU
- Lire -- The Lire log analyzer and report generator
- Zenoss -- Open Source Enterprise Monitoring. Zenoss
Core is an enterprise-grade network and systems monitoring product that
delivers the functionality IT operations teams need to effectively manage the
health and performance of their entire infrastructure through a single,
integrated package.
For far too long, robust IT infrastructure monitoring was out of reach for
most organizations because of the cost and complexity of the proprietary
systems that offered the required functionality. Zenoss has changed the game
by offering a complete, easy-to-use solution as a free (i.e. no money),
downloadable, open source software product.
- Big Brother -- a well-known service/host monitoring system. Big Brother monitors system and network-delivered services for availability. Your current network status is displayed on a color-coded web page in near-real time. When problems are detected, you're immediately notified by e-mail, pager, or text messaging.
- Nagios --
Nagios® is a host and service monitor designed to inform you of network
problems before your clients, end-users or managers do. It has been designed
to run under the Linux operating system, but works fine under most *NIX
variants as well. The monitoring daemon runs intermittent checks on hosts and
services you specify using external "plugins" which return status information
to Nagios. When problems are encountered, the daemon can send notifications
out to administrative contacts in a variety of different ways (email, instant
message, SMS, etc.). Current status information, historical logs, and reports
can all be accessed via a web browser.
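The plugin mechanism described above can be illustrated with a minimal sketch. It assumes the widely used plugin convention in which a check prints one line of status text and reports its result through the process exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name `check_load` and the thresholds are hypothetical and not part of Nagios itself.

```python
import os

# Exit codes assumed from the common plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_load(load, warn=4.0, crit=8.0):
    """Map a 1-minute load average to (exit_code, status_line)."""
    if load is None:
        return UNKNOWN, "LOAD UNKNOWN - could not read load average"
    if load >= crit:
        return CRITICAL, f"LOAD CRITICAL - load1={load:.2f}"
    if load >= warn:
        return WARNING, f"LOAD WARNING - load1={load:.2f}"
    return OK, f"LOAD OK - load1={load:.2f}"


def main():
    """Read the local load average, print one status line, return the code."""
    try:
        load = os.getloadavg()[0]
    except (OSError, AttributeError):
        # getloadavg() is unavailable on some platforms.
        load = None
    code, line = check_load(load)
    print(line)
    return code
```

A real plugin script would finish with `sys.exit(main())`, so that the monitoring daemon can read the result from the exit status.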
- Ganglia -- Ganglia is a scalable
distributed monitoring system for high-performance computing systems such as
clusters and Grids. It is based on a hierarchical design targeted at
federations of clusters. It leverages widely used technologies such as XML for
data representation, XDR for compact, portable data transport, and RRDtool for
data storage and visualization. It uses carefully engineered data structures
and algorithms to achieve very low per-node overheads and high concurrency.
The implementation is robust, has been ported to an extensive set of operating
systems and processor architectures, and is currently in use on over 500
clusters around the world. It has been used to link clusters across university
campuses and around the world and can scale to handle clusters with 2000
nodes.
- Cricket -- is a high performance,
extremely flexible system for monitoring trends in time-series data. Cricket
was expressly developed to help network managers visualize and understand the
traffic on their networks, but it can be used for all kinds of other jobs, as
well.
Cricket has two components, a collector and a grapher. The collector runs from
cron every 5 minutes (or at a different rate, if you want), and stores data
into a data structure managed by RRDtool. Later, when you want to check on the
data you have collected, you can use a web-based interface to view graphs of
the data.
Cricket reads a set of config files called a config tree. The config tree
expresses everything Cricket needs to know about the types of data to be
collected, how to get it, and from which targets it should collect data. The
config tree is designed to minimize redundant information, making it compact
and easy to manage, and preventing silly mistakes from occurring due to
copy-and-paste errors. Cricket is written entirely in Perl and is
distributed under the GNU General Public License.
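The way a config tree cuts down on redundancy can be sketched as settings inheritance: a value defined near the root applies to everything below it unless a deeper level overrides it. This is a rough illustration only. The dictionary layout, the `resolve` function and the key names are hypothetical and do not reproduce Cricket's actual file format.

```python
def resolve(tree, path):
    """Merge settings from the root down to the node named by `path`.

    `tree` is a nested dict; each node may hold a "settings" dict and a
    "children" dict. Deeper levels override values from their ancestors.
    """
    merged = dict(tree.get("settings", {}))
    node = tree
    for name in path:
        node = node["children"][name]
        merged.update(node.get("settings", {}))
    return merged


# Hypothetical tree: the community string and the 5-minute interval are
# written once at the root instead of being copied into every target.
config_tree = {
    "settings": {"snmp-community": "public", "interval": 300},
    "children": {
        "routers": {
            "settings": {"target-type": "interface"},
            "children": {
                "core1": {"settings": {"host": "core1.example.net"}},
                "edge7": {"settings": {"host": "edge7.example.net",
                                       "interval": 60}},
            },
        },
    },
}

core1 = resolve(config_tree, ["routers", "core1"])
edge7 = resolve(config_tree, ["routers", "edge7"])
```

Here `core1` picks up the root interval unchanged, while `edge7` overrides it locally; nothing had to be copied and pasted between the two targets.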
- Zabbix -- ZABBIX is software for monitoring
of your applications, network and servers. ZABBIX supports both polling and
trapping techniques to collect data from monitored hosts. A flexible
notification mechanism makes it quick and easy to configure different types of
notifications for pre-defined events. ZABBIX offers advanced monitoring,
alerting and visualisation features today which are missing in other
monitoring systems, even some of the best commercial ones. Use of industry
standards makes integration of ZABBIX into existing infrastructure
trouble-free.
- BOSS -- the name Batch Object Submission
System may give you the wrong idea that the system is a scheduler or something
like that. In reality it is a job/task monitoring system (not service
monitoring). BOSS (Batch Object Submission System) provides an easy-to-use
book-keeping system for jobs running on a Linux computing farm. Different job
types can be registered to the BOSS System, allowing the storage on a local
database of information specific to the task which is being performed by the
job itself. This information is used both for job monitoring and for
book-keeping.
- Monalisa -- MONitoring Agents using a Large Integrated Services Architecture. The MonALISA framework is a fully distributed service system with no single point of failure and it provides:
- Distributed Registration and Discovery for Services and Applications.
- Monitoring all aspects of complex systems:
- System information for computer nodes and clusters.
- Network information (traffic, flows, connectivity, topology) for
WAN and LAN.
- Monitoring the performance of Applications, Jobs or services.
- End User Systems, and End To End performance measurements.
- Can interact with any other services to provide in near real-time
customized information based on monitoring information.
- Secure, remote administration for services and applications.
- Agents to supervise applications, to restart or reconfigure them, and to
notify other services when certain conditions are detected.
- The Agent system can be used to develop higher level decision services,
implemented as a distributed network of communicating agents, to perform
global optimization tasks.
- Graphical User Interfaces to visualize complex information.
- Global monitoring repositories for distributed Virtual Organizations.
MonALISA is currently used in several large-scale distributed systems and
has proved to be a reliable and scalable system.
- R-GMA -- Integrated Applications Management, Server
Management, and Database Monitoring Software. R-GMA is in
wide use in Grid-like distributed systems.
- Test harness and reporting framework --
Inca is a flexible framework for the automated testing, benchmarking and
monitoring of Grid systems. It includes mechanisms to schedule the execution
of information gathering scripts and to collect, archive, publish, and display
data.
Originally developed for the TeraGrid project, Inca is a general framework
that can be adapted and used by other Grids. Inca offers a diverse set of use
cases including:
- Software Stack Validation & Verification
- Network Bandwidth Measurements
- Grid Benchmarking
- ManageEngine
-- professional monitoring/management tool. Integrated Applications
Management, Server Management, and Database Monitoring Software
- Lemon RRD framework --
Lemon RRD framework is a part of the Lemon project at CERN
(http://cern.ch/lemon) and is used to retrieve metric information from the MR
(Monitoring Repository) and store it in time-series aging data
structures that are serialized as RRD files on disk. These are an integral part of
the RRDtool project (http://www.rrdtool.org), which we use for our purposes.
The data is then passed to the web interface for visualization. The framework is
generic enough to allow data sources other than the MR. LRF supports
grouping of machines (objects) into groups (clusters, racks, hardware
models,...) and provides summary or average overview of each group
independently even if certain machines are part of more of these groupings.
This is all provided already at the time of gathering of information from the
Monitoring Repository. An overview of Lemon is available here, and an overview of the
Lemon RRD framework is here.
- sysstat package -- news,
information, documentation and links for the sysstat utilities
created for Linux. The sysstat utilities are a collection of performance
monitoring tools for Linux. These include sar, sadf, mpstat, iostat and sa
tools.
- Bonnie++ -- is a benchmark suite
that is aimed at performing a number of simple tests of hard drive and file
system performance. Then you can decide which test is important and decide how
to compare different systems after running it.
The main program tests database type access to a single file (or a set of
files if you wish to test more than 1G of storage), and it tests creation,
reading, and deleting of small files which can simulate the usage of programs
such as Squid, INN, or Maildir format email.
- NetLogger -- Anyone who has ever tried to debug or
do performance analysis of complex distributed applications knows that it can
be a very difficult task. Problems may be in many different software components,
hardware components, networks, OS's, etc.
NetLogger is designed to make this easier. NetLogger is both a methodology
for analyzing distributed systems, and a set of tools to help implement the
methodology. In fact, you can use the NetLogger methodology without using any
of the LBNL provided tools.
- Iperf -- a well-known tool for network measurement.
Iperf is a tool to measure maximum TCP bandwidth, allowing the tuning of
various parameters and UDP characteristics. Iperf reports bandwidth, delay
jitter, and datagram loss.
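As a rough illustration of the last two quantities, the sketch below computes a smoothed interarrival jitter (the running estimator defined for RTP in RFC 3550) and a datagram loss percentage from received sequence numbers. That this matches what Iperf computes internally is an assumption; the function names and the sample values are hypothetical.

```python
def rtp_jitter(transit_times):
    """Smoothed jitter: J += (|D| - J) / 16 for each consecutive pair,
    where D is the change in one-way transit time between two packets."""
    jitter = 0.0
    for prev, cur in zip(transit_times, transit_times[1:]):
        jitter += (abs(cur - prev) - jitter) / 16.0
    return jitter


def loss_percent(received_seq, sent_count):
    """Percentage of datagrams that never arrived."""
    if sent_count <= 0:
        raise ValueError("sent_count must be positive")
    lost = sent_count - len(set(received_seq))
    return 100.0 * lost / sent_count


# Hypothetical one-way transit times in milliseconds.
transit_ms = [10.0, 12.0, 11.0, 15.0]
jitter_ms = rtp_jitter(transit_ms)

# Hypothetical run: 10 datagrams sent, sequence numbers 3 and 6 missing.
loss = loss_percent([0, 1, 2, 4, 5, 7, 8, 9], sent_count=10)
```

The division by 16 makes the estimate react slowly to single outliers, which is why a reported jitter is usually much smaller than the largest individual delay variation.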
- NetPerf is a
benchmark that can be used to measure the performance of many different types
of networking.
- Network Monitoring tools --
a large list of available monitoring/measuring tools (SLAC.STANFORD.EDU)
- IOzone -- a good benchmark tool for file systems.
IOzone is a filesystem benchmark tool. The benchmark generates and measures
a variety of file operations. Iozone has been ported to many machines and runs
under many operating systems. Iozone is useful for performing a broad
filesystem analysis of a vendor's computer platform.
Benchmark Features:
- ANSI C source
- POSIX async I/O
- Mmap() file I/O
- Normal file I/O
- Single stream measurement
- Multiple stream measurement
- Distributed fileserver measurements (Cluster)
- POSIX pthreads
- Multi-process measurement
- Excel importable output for graph generation
- Latency plots
- 64bit compatible source
- Large file compatible
- Stonewalling in throughput tests to eliminate straggler effects
- Processor cache size configurable
- Selectable measurements with fsync, O_SYNC
- Builds for: AIX, BSDI, HP-UX, IRIX, FreeBSD, Linux, OpenBSD, NetBSD,
OSFV3, OSFV4, OSFV5, SCO OpenServer, Solaris, Windows95/98/NT
- Internet End-to-end Performance Monitoring -- a decent introduction to the subject
- Distributed Systems Department at LBL (a good
source of information on networking)
- Various measurement/taxonomy tools -- The CAIDA
Tools site contains CAIDA tools and software as well as a taxonomy of
available research and visualization tools.
- The list of measurement tools (SPEC, bonnie, TPC,
a range of kernel tools, etc.)
- Disk benchmarks -- the list of different
tools to do measurement on disk I/O.
- FIO is a tool that will spawn a number of
threads or processes doing a
particular type of I/O action as specified by the user. fio takes a
number of global parameters, each inherited by a thread unless
parameters given to that thread override the setting.
The typical use of fio is to write a job file matching the I/O load
one wants to simulate.
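The inheritance rule described above can be sketched with Python's `configparser`, treating a `[global]` section as the defaults for every job section. This only illustrates the lookup rule; it is not how fio itself parses job files, and the job names and option values are illustrative assumptions.

```python
import configparser

# An illustrative job file: [global] options apply to every job unless the
# job section sets the same option itself.
JOB_FILE = """
[global]
rw = randread
size = 128m
bs = 4k

[job1]

[job2]
rw = randwrite
bs = 64k
"""


def load_jobs(text):
    """Return {job_name: effective_options} with [global] values inherited."""
    parser = configparser.ConfigParser(default_section="global")
    parser.read_string(text)
    # sections() leaves out the default section, so only real jobs remain.
    return {name: dict(parser[name]) for name in parser.sections()}


jobs = load_jobs(JOB_FILE)
for name, options in jobs.items():
    print(name, options)
```

In this sketch `job1` runs entirely on the global settings, while `job2` overrides the access pattern and block size but still inherits the file size.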
- Memtest86 -- A Stand-alone Memory Diagnostic.
Memtest86 is a thorough, stand-alone memory test for x86 architecture computers.
BIOS based memory tests are a quick, cursory check and often miss many of the
failures that are detected by Memtest86.
- BenchMarkHQ -- a fairly large collection of
benchmark utilities (English and Russian)
- SysBench: a system performance benchmark.
SysBench is a modular, cross-platform and multi-threaded benchmark tool for
evaluating OS parameters that are important for a system running a database
under intensive load.
The idea of this benchmark suite is to quickly get an impression about system
performance without setting up complex database benchmarks or even without
installing a database at all.
Current features allow you to test the following system parameters:
- file I/O performance
- scheduler performance
- memory allocation and transfer speed
- POSIX threads implementation performance
- database server performance (OLTP benchmark)
- CPU/Memory/Disk/System Tests -- many testing tools for
different parts of the system.