OpenNMS® Installation Guide (2024)

Chapter 7. Troubleshooting an OpenNMS Installation

Common Installation Issues

The following section contains advice for overcoming common installation issues. If your issue is not addressed below, please see the section on where to get help.

Dependency Problems

To assist with code management, the easiest way to install OpenNMS is via packages. Every effort has been made to ensure that the packages OpenNMS depends on are required before the OpenNMS package can be installed. You should be able to find those packages on the distribution CDs that came with your system. For some of the more obscure packages, you can visit the OpenNMS FTP site and check in the /pub/dependencies directory. Sites like Ibiblio and FreshRPMs are also good sources.

Error: "Started OpenNMS, but it has not finished starting up"

This can happen for a number of reasons. You can run "opennms -v status" a few times after getting this error to see whether OpenNMS eventually finishes starting and, if not, which daemons never start up completely. Here are some of the likely causes of this problem:

  1. OpenNMS takes a while to start up. This can happen on larger installations; when it does, "opennms -v status" will eventually show that all services have started. By default, the startup script checks 10 times to see whether OpenNMS has started, waiting 5 seconds between tries. You can increase the number of tries by creating $OPENNMS_HOME/etc/opennms.conf and adding a line like "START_TIMEOUT=20", which doubles the number of checks. Set the value to 0 to have the startup script not wait for OpenNMS to start.

  2. Database is not running. If only about half or fewer of the daemons are shown as running, you can check for this condition by looking for FATAL errors in the log files. You'll see something like "Error accessing database" in the logs.

  3. Dhcpd doesn't start. See the item in the next section.

  4. JNI library problem. OpenNMS uses a few native C libraries that are accessed using JNI (Java Native Interface). Normally they just work, except users have started seeing problems when running Linux in native AMD64 mode where they end up using a 32-bit (x86) version of Java and a 64-bit (AMD64) version of the JNI libraries, or vice-versa. If you have this problem, you might want to try switching your version of Java from 32-bit to 64-bit or in the other direction.

  5. Other. If OpenNMS is installed, and the packages were not forced in using options like "--nodeps", the application should run just fine. If it does not, OpenNMS has a robust logging facility. Change to the logs directory (usually /var/log/opennms) and search the logs, using grep or your tool of choice, for words like FATAL and ERROR (the two highest log severities). Those events should give you clues as to why OpenNMS is not working.
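The log search described in the last item can be sketched as a small shell function. This is a minimal sketch, not part of OpenNMS: the function name find_log_errors is hypothetical, and the default directory is simply the usual location named above.

```shell
# Hypothetical helper: list every FATAL or ERROR line in the OpenNMS logs.
# $1 is the log directory; it defaults to the usual /var/log/opennms.
find_log_errors() {
    log_dir="${1:-/var/log/opennms}"
    # /dev/null is passed as an extra file so grep always prints the file name.
    # Output format is file:line-number:matching-line.
    find "$log_dir" -type f -exec grep -nE 'FATAL|ERROR' /dev/null {} +
}

# Example:
#   find_log_errors              # search /var/log/opennms
#   find_log_errors /some/dir    # search a different log directory
```

The match is case-sensitive, so it picks up the severity keywords themselves rather than every message that happens to contain the word "error".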

DHCP Poller Won't Start

The OpenNMS DHCP poller will fail to start on most operating systems (Linux, in particular) if you are running a DHCP client on the OpenNMS server. You'll see this by running "opennms -v status" and seeing everything in the running state except for Dhcpd. The solution is to edit $OPENNMS_HOME/etc/service-configuration.xml and comment out the "<service>...</service>" stanza for Dhcpd. For example, this is what the section would look like after modification to disable Dhcpd:

 <!-- Commented out since we have a DHCP client on this server
 <service>
   <name>OpenNMS:Name=Dhcpd</name>
   <class-name>org.opennms.netmgt.dhcpd.jmx.Dhcpd</class-name>
   <invoke pass="1" method="start"/>
   <invoke at="status" pass="0" method="status"/>
   <invoke at="stop" pass="0" method="stop"/>
 </service>
 -->

We discourage running OpenNMS on a server that is a DHCP client, both because OpenNMS may not be able to monitor DHCP servers on the network and because the monitoring server should have a static IP address for receiving traps and should rely on as few network services as possible.

Error: "runjava: Could not find an appropriate JRE"

The runjava program is used to locate a suitable JRE for OpenNMS at install time that will be used for the installer and also for running OpenNMS after installation. See the section earlier in this manual on installing Java for OpenNMS. If you installed Java in a location that runjava cannot find, you can use its "-f" option to specify the JRE you want OpenNMS to use.

Error: "The database server's error messages are not in English ..."

You either need to set "lc_messages = 'C'" in your postgresql.conf file and restart PostgreSQL or upgrade to PostgreSQL 7.4 or later.

The installer does not always verify that an operation will succeed before executing the operation (e.g.: dropping database functions). In this case, it catches the exceptions returned from the database and checks the exception to see if it is an "okay" exception that should be ignored (e.g.: if the database function does not exist when attempting to drop a function).

In PostgreSQL 7.4 and later, a new client/server protocol is used (version 3, to be specific) that provides specific error codes intended for programmatic evaluation, and we use these error codes if the server provides them. However, for PostgreSQL versions before 7.4, we require that the database server error language be in English (the 'C' locale) so that we can parse the text error messages. If you are not running PostgreSQL 7.4 or newer, the installer executes a bogus query against the database and checks for an expected result in English.
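Before restarting PostgreSQL, it can help to confirm what postgresql.conf actually sets. The function below is a minimal sketch; its name, lc_messages_value, is hypothetical, and the location of postgresql.conf varies by system, so the path is passed in as an argument.

```shell
# Hypothetical helper: print the value of the last uncommented
# lc_messages line in the given postgresql.conf.
# Prints nothing if the setting is absent or commented out.
lc_messages_value() {
    sed -n "s/^[[:space:]]*lc_messages[[:space:]]*=[[:space:]]*'\{0,1\}\([^'#[:space:]]*\).*/\1/p" "$1" | tail -n 1
}

# Example (the path is a placeholder for your own installation):
#   lc_messages_value /path/to/postgresql.conf
```

If the function prints anything other than C, edit the file as described above and restart PostgreSQL.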

Error: "Column X in new table has NOT NULL constraint ..."

This is a warning that the installer might not update tables successfully. Make sure that your database is backed up, and run the installer again with the "-N" option to ignore this check.

As an attempt to ensure that the install will complete successfully, a check is done to see if there might be any rows with NULL columns that might be inserted into a column in an upgrade table with a NOT NULL constraint. This usually happens when a previous run of the installer failed, or might be due to modifications to the database schema or a really old version of the schema.

Error: "One or more backup tables from a previous install still exists"

When the installer runs to upgrade the OpenNMS database from a previous install, it often updates table schemas. When it does this, it copies the data in a table to a temporary table (e.g.: the contents of node are copied into node_old_11033991291234). The original table is deleted, the new version of the table is created, the data in the temporary table is translated into the new table, and finally the temporary table is deleted.

Unfortunately, the installer cannot check for all problems that might break translation, so sometimes the translation step fails. In this case, the installer "reverts" the table it was processing by dropping the new table and moving the temporary table into its place.

Reverting the table in case of a problem is all good and well, but sometimes even it does not work properly, especially with older versions of the Java installer. If this happens, the temporary table (the one with "_old_" in it) is left with all of the old data. Until OpenNMS 1.1.5, this problem would not be caught the next time you ran the installer. The installer would see that you did not have the node table, for example, and happily continue to create a new one for you. This is bad, especially since you probably still have data that you care about that is now in the "old" table.

If you get this error, you will want to get rid of the table(s) containing "_old_"; however, you should first check whether they contain data. For example, if you have a single table, node_old_11033991291234, no other node_old_* tables, and no node table, you can simply rename the table:

# psql -h localhost -U opennms opennms
Welcome to psql 7.4.6, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

opennms=# ALTER TABLE node_old_11033991291234 RENAME TO node;

You can use the "\d" command within psql to see what other tables exist in your database. You can use "SELECT count(*) from table;" (fill in the table name for "table") to get a count of rows in any table. If you have empty tables, you can just drop them. If you have multiple tables with data, you will have to decide which table of data you want to keep or merge them. This is left as a (not so simple) exercise for the reader.
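When several tables need checking, the row count can be wrapped in a small shell function. This is a minimal sketch: the function name count_rows is hypothetical, and the connection settings are copied from the psql example above and may differ at your site.

```shell
# Hypothetical helper: print the number of rows in one table.
# Connection settings (localhost, user opennms, database opennms) are
# taken from the psql example above; adjust them for your site.
count_rows() {
    # -A: unaligned output, -t: rows only, -c: run one command and exit
    psql -h localhost -U opennms -A -t -c "SELECT count(*) FROM \"$1\";" opennms
}

# Example (table names are illustrative):
#   for t in node node_old_11033991291234; do
#       printf '%s: ' "$t"
#       count_rows "$t"
#   done
```

A table that reports 0 rows is a candidate for dropping; any table with data needs the manual decision described above.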

Error: "Table X contains N rows (out of M) that violate new constraint Y"

Over time, OpenNMS has extended its database schema to improve functionality, in part by introducing additional foreign key constraints. These constraints ensure internal database consistency when data in two tables are tied together by a shared key. For example, each event can have a pointer to the node that it is related to; a foreign key constraint requires that an event must not point at a node that does not exist. This error can happen because of the way certain administrative functions worked in older versions of OpenNMS, or because the database was modified outside of OpenNMS (the latter is common for larger sites).

Starting with 1.1.5, when we upgrade the database schema, we first check for rows that violate any new foreign key constraints that might be added. There are three options to fix these errors:

  1. Remove the offending rows. This is suggested if the number of rows that violate the constraint is small in comparison to the total number of rows in the affected table and if you don't need the data. Use "$OPENNMS_HOME/bin/install -C <constraint> -X" to delete the offending rows.

  2. Set the key in the offending rows to NULL. This is suggested if you need to keep the data around or are not yet sure about what to do with it. Use "$OPENNMS_HOME/bin/install -C <constraint>" to set the key column to NULL in the offending rows.

  3. Fix the key in the offending rows. This is for advanced users and requires a good amount of effort. This is left as an exercise for the reader.

Error: "- adding iplike database function... <snip> org.postgresql.util.PSQLException: ERROR: could not access file '<snip>/lib/iplike.so': Permission denied"

The PostgreSQL server cannot access the iplike.so file. This could be due to the file itself not having appropriate permissions for the user that PostgreSQL runs as and/or one or more of the parent directories of the iplike.so not having appropriate permissions.

This error is seen even when running the installer as root because it is not OpenNMS nor the installer that cannot access the iplike.so file, but the PostgreSQL database. The installer instructs the PostgreSQL database to load the iplike.so and the PostgreSQL database server usually runs as a non-root user, so it is subject to filesystem access control checks like any other normal user. This is commonly seen when people install OpenNMS into a home directory for root or another user and the permissions on that home directory do not allow users other than the owner of the directory access.
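Because any parent directory can block access, it helps to look at the permissions of every directory on the way to iplike.so. The function below is a minimal sketch; the name show_path_permissions is hypothetical and the path in the example is a placeholder.

```shell
# Hypothetical helper: print "ls -ld" output for a file and for each of
# its parent directories, ending at the filesystem root.
show_path_permissions() {
    p="$1"
    while :; do
        ls -ld "$p"
        # Stop at the root, or at "." when a relative path was given.
        case "$p" in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}

# Example (the path is a placeholder for your own installation):
#   show_path_permissions /path/to/lib/iplike.so
```

The user that PostgreSQL runs as needs execute (x) permission on every directory listed and read (r) permission on iplike.so itself.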

Error: "- adding iplike database function... <snip> org.postgresql.util.PSQLException: ERROR: could not load library ..."

The latter part of the error could be something like "<path>/iplike.so: cannot open shared object file: No such file or directory" or "ld.so.1: postgres: fatal: <path>/iplike.so: wrong ELF class: ELFCLASS32".

The PostgreSQL server cannot load the iplike.so file. This is almost always caused by the PostgreSQL server and the iplike.so file being compiled for different processor instruction sets. This is commonly seen when the PostgreSQL server is compiled to use a 64-bit instruction set but the OpenNMS iplike.so shared object is compiled for a 32-bit instruction set, although the opposite is possible as well. You can use the "file" command on iplike.so and on the postmaster binary shipped with PostgreSQL to check their instruction sets.
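The comparison with the "file" command can be sketched as follows. The function names are hypothetical, the paths in the example are placeholders, and the sketch assumes that "file" describes ELF binaries with the phrase "ELF 32-bit" or "ELF 64-bit", as common implementations do.

```shell
# Hypothetical helper: reduce one line of "file" output to 32, 64, or unknown.
elf_bits_from_description() {
    case "$1" in
        *"ELF 64-bit"*) echo 64 ;;
        *"ELF 32-bit"*) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Hypothetical helper: report the ELF class of a file on disk.
# -L makes "file" follow symbolic links.
elf_bits() {
    elf_bits_from_description "$(file -L "$1")"
}

# Example (both paths are placeholders for your own installation):
#   elf_bits /path/to/lib/iplike.so
#   elf_bits /path/to/bin/postmaster
```

If the two results differ, the library and the server were built for different instruction sets, which matches the "wrong ELF class" message quoted above.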

The easiest solution is to see if there is a packaged version of OpenNMS compiled for the same instruction set (32- or 64-bit) as your PostgreSQL server. The next easiest method for most users is to switch the PostgreSQL server to match the instruction set that the iplike.so file was compiled for. For advanced users, you can compile OpenNMS yourself to fit the processor set that you need. See this post to the discuss list for some pointers.

Error: "Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation "pg_user" does not exist" when running installer.

This error means the database was not created properly. Since the installer script is supposed to create the database, one might assume it is a problem with OpenNMS, but instead it is an issue with the SELinux portions of Red Hat 4 (and CentOS 4). Basically, the PostgreSQL initdb command is not able to write to /dev/null, and it fails without a useful error message.

To get around this, run the following commands:

  1. stop postgres

  2. rm -rf /var/lib/pgsql/data

  3. /usr/sbin/setenforce 0

  4. start postgres

Note that step 2 deletes the entire PostgreSQL data directory, including any changes you made to the PostgreSQL configuration files, so you will need to redo them.

Error: java.io.FileNotFoundException: ... (Permission denied)

An exact example of this error is: "java.io.FileNotFoundException: /opt/opennms/etc/users.xml (Permission denied)".

If the above error happens when using admin functions through the web interface, such as managing users, notifications, and adding nodes, then the Tomcat web server is running as a non-root user but you haven't changed the permissions on the configuration files so the Tomcat user can access them. Go back and follow the instructions earlier in the install guide on setting up Tomcat to run as a non-root user.

Where to Get Help

OpenNMS is a community supported project. Please keep that in mind when seeking help on the program, as no one gets paid to work on the project (unless it is through a commercial support contract).

The Release Notes

Check the release notes for this release. They are in the Documentation section of the OpenNMS project page at SourceForge.

The OpenNMS Web Site

The main OpenNMS site is a Wiki. As a community project, there is a lot of good advice and information available there. In particular, we suggest checking the above-mentioned release notes, the FAQ entries on the wiki, the bug database and, of course, Google, before posting to a mailing list.

The OpenNMS Mailing Lists

OpenNMS maintains a number of active mailing lists on SourceForge:

opennms-announce

A low traffic, moderated mailing list for OpenNMS announcements. All posts to this list are duplicated on the opennms-discuss list.

opennms-cvs

This is a fairly high traffic list of all updates to the Subversion repositories on SourceForge. Moderated. Only SVN updates are posted here (no discussion).

opennms-devel

This list is for discussion of development of the OpenNMS codebase.

opennms-discuss

This is the main OpenNMS discuss list. It's pretty friendly, and reasonably high-volume. It tends to focus on configuration issues and general discussion of network management, but pretty much anything goes here. However, it is suggested that installation-related issues go to the opennms-install list instead.

opennms-install

This is a great list for new users to OpenNMS. The main focus is installation issues (cleared up by this great documentation, right?) but most "newbie" questions are welcome here.

opennms-maps

OpenNMS has a network map feature, which includes code for automatically determining relationships between hosts (Linkd). This is the appropriate list for discussion of maps and the underlying Linkd code.

opennms-windows

A discussion list for people running OpenNMS on Windows.

opennms-francais

A list for discussion of OpenNMS in French.

opennms-italia

A list for discussion of OpenNMS in Italian.

opennms-ug-tokyo

A list for discussion of OpenNMS in Japanese, as well as general discussion among the Tokyo OpenNMS Users Group.

opennms-ug-uk

A list for discussion of OpenNMS in UK English for those who don't speak American English (OK, just kidding). Actually, a discussion list for the UK OpenNMS Users Group. ;)

The OpenNMS mailing lists are also archived at gmane.org.

Commercial Support

If you are using OpenNMS in a production environment, or are considering it, you might be interested in commercial support. The OpenNMS Group maintains the OpenNMS project, and we also offer support, training, consulting services and custom development.

Author: Annamae Dooley
