Dime – Collecting JBoss Management Notifications

I wanted to monitor the applications deployed on ~10 JBoss servers that are spread across different networks.

After some searching I quickly found a way to hook into the different Deployers and listen to the Notifications that JBoss emits. It seemed simple enough, so I wrote a small client application that I could deploy to each of the servers, plus a Manager app that the servers send the Notifications to. The Manager lets me display the current state of all the applications on all the various servers, including which version of each app is deployed, for future reference.

The notifications are sent whenever an application is deployed or undeployed, so one use of Dime is to show the state of our nightly builds after the code has passed compilation and the initial JUnit tests. The application shows whether any app failed to deploy and also which part of the application caused the deployment to fail.

Hence the name Dime, or Deployment Information Made Easy. :)

The final solution did, however, require getting down and dirty with the JBoss source code…

Things to note

I am only showing part of the solution below. Before a deployment reaches the start method it first runs through the init and create methods, where things may also go wrong, so those were updated in a similar way. There is also some code that needs updating in the MainDeployer, but again the changes are similar to the ones I made to the EARDeployer.

I’m running these fixes on JBoss 4.2.3, but I know the problem exists in other versions as well. Since the application part of Dime is a basic javax.management.NotificationListener, I wouldn’t be surprised if my code could be used on other application servers such as IBM's WebSphere. For small businesses trying to keep costs down, I think it is a nice way to get an overview of which applications are deployed (including versions) on different servers and to quickly be notified of a failed deployment.
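
Since everything here is plain JMX, the listener side is small. Below is a minimal sketch of a NotificationListener subscribing to a deployer MBean. This is not the actual Dime code: the ObjectName jboss.system:service=MainDeployer is the deployer name I use on JBoss 4.2, and in practice the listener is packaged and deployed inside each server rather than looked up from a main method.

import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;

public class DimeListener implements NotificationListener
{
   public void handleNotification(Notification notification, Object handback)
   {
      // In Dime this is where the notification type (deploy/undeploy/failed) and
      // its message (typically the deployment URL) get forwarded to the manager app.
      System.out.println(notification.getType() + ": " + notification.getMessage());
   }

   public static void main(String[] args) throws Exception
   {
      // For illustration only: grab the MBeanServer running in this JVM.
      MBeanServer server = (MBeanServer) MBeanServerFactory.findMBeanServer(null).get(0);

      // Subscribe to the MainDeployer's notifications (name as used on JBoss 4.2).
      ObjectName mainDeployer = new ObjectName("jboss.system:service=MainDeployer");
      server.addNotificationListener(mainDeployer, new DimeListener(), null, null);
   }
}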

Missing notifications or notifications at the wrong time…

When I started I quickly realized that I was only receiving some of the notifications. I outlined the problem in a post on the JBoss forums but never got any responses, so I decided to find out for myself using the JBoss source code. In short, the problems were:

  • on a successful deployment, the final notification was START_DEPLOYER
  • on an undeploy, the first notification was STARTED, which was the missing notification from the previous point
  • on a failed deployment, the final notification was also delayed… sometimes. The FAILED notification was sometimes sent on undeploy (quite pointless) or not sent at all, depending somewhat on why the deployment failed.

Solution

Going through the source code I quickly realized that the reason I sometimes did not receive notifications was that they just weren’t being sent. An example is the start method of the EARDeployer. When an EAR has been deployed on JBoss you will see a line in the log file similar to

17:33:23,945 INFO  [org.jboss.deployment.EARDeployer] Started J2EE application: file:/home/daniel/messaging-platform/JBoss423a/server/custom1/deploy/webs/100.test.ear

Looking at the source code of the start method we can easily see where this is logged, and that nothing else is done after it has been logged.

public void start(DeploymentInfo di)
      throws DeploymentException
   {
      super.start (di);
      try
      {
         // Commit the top level policy configuration
         PolicyConfiguration pc = (PolicyConfiguration)
            di.context.get("javax.security.jacc.PolicyConfiguration");
         pc.commit();
         Policy.getPolicy().refresh();
         serviceController.start(di.deployedObject);
      }
      catch (Exception e)
      {
         DeploymentException.rethrowAsDeploymentException("Error during start of EARDeployment: " + di.url, e);
      }
      log.info ("Started J2EE application: " + di.url);
   }

What I was expecting to see here is some code updating the DeploymentInfo with a DeploymentState of STARTED and then notifying the listeners. I also expected to see some form of error handling where the listeners would be notified of a failure, but the method just throws any DeploymentExceptions. These are never caught in a way that would notify listeners, which means we are missing a lot of notifications about the deployment's progress.

To fix this, I changed the code to be

public void start(DeploymentInfo di)
      throws DeploymentException
   {
      try
      {
         super.start (di);
      }
      catch (DeploymentException e)
      {
         di.state = DeploymentState.FAILED;
         sendNotification(di);

         throw e;
      }

      try
      {
         // Commit the top level policy configuration
         PolicyConfiguration pc = (PolicyConfiguration)
            di.context.get("javax.security.jacc.PolicyConfiguration");
         pc.commit();
         Policy.getPolicy().refresh();
         serviceController.start(di.deployedObject);
      }
      catch (Exception e)
      {
         di.state = DeploymentState.FAILED;
         sendNotification(di);

         DeploymentException.rethrowAsDeploymentException("Error during start of EARDeployment: " + di.url, e);
      }
      log.info ("Started J2EE application: " + di.url);

      di.state = DeploymentState.STARTED;
      sendNotification(di);
   }

Now we are catching any exceptions that occur during the deployment and notifying the listeners that an error occurred. Once the exception is caught and the listeners have been notified, we simply rethrow the exception. We also send a notification once the application has started.
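
The sendNotification(di) calls above use a small helper I added to the deployer. I'm not reproducing my exact version here, but a minimal sketch could look like the following, assuming the deployer extends ServiceMBeanSupport (which supplies sendNotification(Notification) and getServiceName()). The notification type string and the userData payload are conventions of my own, not JBoss constants.

   // Sketch of a sendNotification(DeploymentInfo) helper inside the deployer class.
   private final java.util.concurrent.atomic.AtomicLong dimeSequence =
         new java.util.concurrent.atomic.AtomicLong();

   private void sendNotification(DeploymentInfo di)
   {
      // Build a plain JMX notification carrying the deployment state and URL.
      javax.management.Notification n = new javax.management.Notification(
            "dime.deployment.state",          // custom type string, not a JBoss constant
            getServiceName(),                 // this deployer's ObjectName
            dimeSequence.incrementAndGet(),
            di.state + " " + di.url);
      n.setUserData(di.url.toString());       // keep the payload serializable for remote listeners
      sendNotification(n);                    // broadcast via the inherited JMX support
   }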

How to use

The least invasive way to get these fixes, and the way I updated our servers, was to manually replace the EARDeployer and MainDeployer class files in the JBoss jars. I've been running the servers for two weeks since I made these changes and haven't had any issues related to them.

Singleton in a Cluster

I needed to find a way to run a Quartz cron job on only one node in my cluster and fail over to the second node should the first one fail.

My first try was to deploy it to the deploy-hasingleton directory, which was supposed to work the way I wanted. Unfortunately my app, which runs without problems in both the deploy and farm directories, threw a whole lot of exceptions when I tried to deploy it in deploy-hasingleton. On top of that, deploy-hasingleton doesn’t allow hot deployment of apps, which is a shame.

I did find a solution though.

@ResourceAdapter("quartz-ra.rar")
@MessageDriven(activationConfig = { 
   @ActivationConfigProperty(propertyName = "cronTrigger", 
                             propertyValue = "0 */1 * * * ?") })
public class MyJob implements Job
{
    public void execute(JobExecutionContext jobExecutionContext) throws JobExecutionException
    {
          System.err.println("Hello World!");
    }
}

The code above creates a job which will print "Hello World!" to System.err every minute on the minute as you'd expect from a cron job. In a clustered environment it will do so on every node.

Modifying the code to

@ResourceAdapter("quartz-ra.rar")
@MessageDriven(activationConfig = {
   @ActivationConfigProperty(propertyName = "cronTrigger",
                             propertyValue = "0 */1 * * * ?")
})
@Depends("jboss.ha:service=HASingletonDeployer,type=Barrier")
public class MyJob implements Job
{
    public void execute(JobExecutionContext jobExecutionContext) throws JobExecutionException
    {
          System.err.println("Hello World!");
    }
}

will make the job run only on the node that the HASingletonDeployer considers the master node. Kill that node and the job will start running on the node that becomes the new master. :)

MySQL Clustering Config

I’ve had two positive comments on my previous post, so I figured it was time to write an update on how my work has been going.

The MySQL clustered database is part of a large project I’ve been working on for the last two months. The basic server setup is two clustered JBoss 4.2.3 application servers running on top of two clustered MySQL 5.0.67 servers. There is a third server used as a backup, which currently only runs the MySQL management console. I’ve noticed that even when there is high load on the two NDB data nodes, the management console does not do much.

Each of the four main servers runs two Dual-Core AMD Opteron processors with 8 GB of RAM.

Even though a lot of my work has been reworking my own code to optimize and improve it, a lot of time has also been spent looking for a MySQL cluster configuration that would cope with the load, as I found it was quite easy to kill the cluster in the beginning.

The config I am currently using is below, followed by some results from the load testing I’ve been doing.

[TCP DEFAULT]
SendBufferMemory=2M
ReceiveBufferMemory=2M

[NDBD DEFAULT]
NoOfReplicas=2

# Avoid using the swap
LockPagesInMainMemory=1

#DataMemory (memory for records and ordered indexes)
DataMemory=3072M

#IndexMemory (memory for Primary key hash index and unique hash index)
IndexMemory=384M

#Redolog
# NoOfFragmentLogFiles = (2 * DataMemory) / 64MB, where 64MB is the log file size in 5.0
NoOfFragmentLogFiles=96

#RedoBuffer of 32M. If you get "out of redobuffer" errors you can increase it, but it is
#more likely a result of slow disks.
RedoBuffer=32M

MaxNoOfTables=4096
MaxNoOfAttributes=24756
MaxNoOfOrderedIndexes=2048
MaxNoOfUniqueHashIndexes=512

MaxNoOfConcurrentOperations=1000000

TimeBetweenGlobalCheckpoints=1000

#the default value for TimeBetweenLocalCheckpoints is very good
TimeBetweenLocalCheckpoints=20

# The default of 1200 was too low for the initial tests, but the code has been improved a lot
# so 12000 may be too high now.
TransactionDeadlockDetectionTimeout=12000
DataDir=/var/lib/mysql-cluster
BackupDataDir=/var/lib/mysql-cluster/backup

[MYSQLD DEFAULT]

[NDB_MGMD DEFAULT]

# Section for the cluster management node
[NDB_MGMD]
# IP address of the management node (this system)
HostName=10.30.28.10

# Section for the storage nodes
[NDBD]
# IP address of the first storage node
HostName=10.30.28.11

[NDBD]
# IP address of the second storage node
HostName=10.30.28.12

# one [MYSQLD] per storage node
[MYSQLD]
[MYSQLD]

Here are some numbers showing what I've been able to get out of the MySQL cluster. One iteration of my code results in:
12 selects
10 inserts
10 updates
4 deletes

This might sound like a lot, but it is a service-oriented application that relies on persistent JBoss queues and demands 100% redundancy, so even if the server dies, it will pick up where it died and does not lose any data.
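
To make the shape of the workload concrete, here is a rough JDBC sketch of a single iteration plus the timing arithmetic behind the numbers below. This is not my actual application code (which goes through JBoss and persistent queues rather than raw JDBC), and the host, credentials, table and column names are made up for illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class IterationSketch
{
   // One iteration: 12 selects, 10 inserts, 10 updates, 4 deletes in a single transaction.
   // Assumes a clustered table: create table messages (id int primary key, payload varchar(255)) engine=ndbcluster;
   static void runIteration(Connection con, int n) throws Exception
   {
      PreparedStatement select = con.prepareStatement("select payload from messages where id = ?");
      for (int i = 0; i < 12; i++)
      {
         select.setInt(1, n * 10 + i);
         select.executeQuery().close();
      }
      select.close();

      PreparedStatement insert = con.prepareStatement("insert into messages (id, payload) values (?, ?)");
      PreparedStatement update = con.prepareStatement("update messages set payload = ? where id = ?");
      for (int i = 0; i < 10; i++)
      {
         insert.setInt(1, n * 10 + i);
         insert.setString(2, "payload");
         insert.executeUpdate();

         update.setString(1, "updated payload");
         update.setInt(2, n * 10 + i);
         update.executeUpdate();
      }
      insert.close();
      update.close();

      PreparedStatement delete = con.prepareStatement("delete from messages where id = ?");
      for (int i = 0; i < 4; i++)
      {
         delete.setInt(1, n * 10 + i);
         delete.executeUpdate();
      }
      delete.close();

      con.commit();
   }

   public static void main(String[] args) throws Exception
   {
      Class.forName("com.mysql.jdbc.Driver");
      Connection con = DriverManager.getConnection(
            "jdbc:mysql://10.30.28.11:3306/loadtest", "user", "password");
      con.setAutoCommit(false);

      int iterations = 10000;
      long start = System.currentTimeMillis();
      for (int n = 0; n < iterations; n++)
      {
         runIteration(con, n);
      }
      long millis = System.currentTimeMillis() - start;
      System.out.println(iterations + " iterations in " + (millis / 1000) + "s, "
            + (iterations * 1000.0 / millis) + " iterations/second");
      con.close();
   }
}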

My first benchmark was 10 000 iterations, which is the current load I can expect on the application over the course of an hour, i.e.
120 000 selects
100 000 inserts
100 000 updates
40 000 deletes

This took a total of 93 seconds producing just over 100 iterations per second.

The second test, meant to really put the application and environment through their paces, was 100 000 iterations. It completed in roughly 15 minutes, producing around 110 iterations per second. This is about 10x the load we'd expect to see during the course of one hour, so it is nice to see that the setup has room to grow quite a bit before we need more hardware. :)

I am currently working on setting up a third test which will run 100 000 iterations every hour for 10 hours, producing 1 million rows of data.

MySQL Clustering on Ubuntu

I spent some time getting MySQL clustering working on Ubuntu after reading a guide on Howto Forge. The guide, however, went into the details of compiling and installing MySQL from source, so I’m writing this to show the steps needed to get it set up on a fresh Ubuntu installation.

For a correct setup you will need 3 machines. The first machine will serve as the management node, and the other two will be storage nodes.

At the time of writing, the current stable version of Ubuntu is 8.04.1 and the MySQL version it installs is 5.0.51.

During the configuration I log onto the machines and use the command

sudo su -

to gain a permanent root shell, saving myself from having to type sudo in front of every command. Use your own discretion.

Installing MySQL

Using apt this is straightforward. Just type the following command on all three machines to install MySQL server.

apt-get install mysql-server

When asked, set the root password for the MySQL database. You'll need to remember this one. Once MySQL server is installed we'll proceed to configure the management node.

Configuring the Management Node

Create and edit the file /etc/mysql/ndb_mgmd.cnf. Copy and paste the text below, changing the IP addresses to match your setup as necessary.

[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=80M    # How much memory to allocate for data storage
IndexMemory=18M   # How much memory to allocate for index storage
# For DataMemory and IndexMemory, we have used the
# default values. Since the "world" database takes up
# only about 500KB, this should be more than enough for
# this example Cluster setup.
[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
[TCP DEFAULT]
# Section for the cluster management node
[NDB_MGMD]
# IP address of the management node (this system)
HostName=192.168.1.5

# Section for the storage nodes
[NDBD]
# IP address of the first storage node
HostName=192.168.1.6
DataDir=/var/lib/mysql-cluster
BackupDataDir=/var/lib/mysql-cluster/backup
DataMemory=512M
[NDBD]
# IP address of the second storage node
HostName=192.168.1.7
DataDir=/var/lib/mysql-cluster
BackupDataDir=/var/lib/mysql-cluster/backup
DataMemory=512M

# one [MYSQLD] per storage node
[MYSQLD]
[MYSQLD]

Configuring the Storage Nodes

As you can see in the file we created in the previous step, the cluster will be using /var/lib/mysql-cluster on the storage machines. This directory is created when you install MySQL server, but it is owned by root. We need to create the backup directory and change the ownership to mysql.

mkdir /var/lib/mysql-cluster/backup
chown -R mysql:mysql /var/lib/mysql-cluster

Now we'll need to edit the MySQL configuration so that the storage nodes will communicate with the Management Node.

Edit /etc/mysql/my.cnf

Search for [mysqld] and add the following.

[mysqld]
ndbcluster
# IP address of the cluster management node
ndb-connectstring=192.168.1.5

Then scroll down to the bottom until you see [MYSQL_CLUSTER]. Uncomment the line and edit it so it looks like

[MYSQL_CLUSTER]
ndb-connectstring=192.168.1.5

The reason the connect string is found twice in the MySQL config file is that one is used by the MySQL server, and the other is used by the ndb data node daemon. Save the changes to the file.

Make sure you complete the changes on both data nodes.

Start the Management Node

Start the Management Node using

/etc/init.d/mysql-ndb-mgm restart

The process shouldn't be running yet, but using restart doesn't hurt. Once it is started we can access the management console using the command ndb_mgm. At the prompt type show; and you will see

ndb_mgm> show;
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]    2 node(s)
id=2 (not connected, accepting connect from 192.168.1.6)
id=3 (not connected, accepting connect from 192.168.1.7)

[ndb_mgmd(MGM)]    1 node(s)
id=1    @192.168.1.5  (Version: 5.0.51)

[mysqld(API)]    2 node(s)
id=4 (not connected, accepting connect from any host)
id=5 (not connected, accepting connect from any host)

As you can see the management node is waiting for connections from the data nodes.

Start the Data Nodes

On the data nodes, issue the commands

/etc/init.d/mysql restart
/etc/init.d/mysql-ndb restart

Go back to the management node, type show; again, and now you should see something similar to

id=2    @192.168.1.6  (Version: 5.0.51, starting, Nodegroup: 0)
id=3    @192.168.1.7  (Version: 5.0.51, starting, Nodegroup: 0)

Once they have started properly, the show command should display

ndb_mgm> show;
Cluster Configuration
---------------------
[ndbd(NDB)]    2 node(s)
id=2    @192.168.1.6  (Version: 5.0.51, Nodegroup: 0, Master)
id=3    @192.168.1.7  (Version: 5.0.51, Nodegroup: 0)
[ndb_mgmd(MGM)]    1 node(s)
id=1    @192.168.1.5  (Version: 5.0.51)
[mysqld(API)]    2 node(s)
id=4    @192.168.1.7  (Version: 5.0.51)
id=5    @192.168.1.6  (Version: 5.0.51)

Congratulations, your cluster is now set up.

Testing the cluster

Issue the following on both data nodes to create the test database. Since clustering in MySQL is done on a per-table basis, we have to create the database manually on both data nodes.

$> mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.0.51a-3ubuntu5.1 (Ubuntu)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create database clustertest;
Query OK, 1 row affected (0.00 sec)

Once this is done, on ONE of the data nodes, create a test table and add an entry.

mysql> use clustertest;
Database changed
mysql> create table test (i int) engine=ndbcluster;
Query OK, 0 rows affected (0.71 sec)

mysql> insert into test values (1);
Query OK, 1 row affected (0.05 sec)

mysql> select * from test;
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.03 sec)

We've just created a table called test, added a value to it and made sure that the table contains one entry. Note that engine=ndbcluster must be used to let MySQL know that this table should be clustered across the data nodes. Let's make sure that the table has in fact been created on the other data node and contains one entry.

mysql> use clustertest;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+-----------------------+
| Tables_in_clustertest |
+-----------------------+
| test                  |
+-----------------------+
1 row in set (0.01 sec)

mysql> select * from test;
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.04 sec)

As you can see, the cluster is working.

Moving an existing database to the cluster

Now that we have the cluster working, we can easily change an existing database to be clustered. All you need to do is run the following command on each of the tables.

alter table my_test_table engine=ndbcluster;

The table, and all its data, will be copied to the data nodes and you can now access and change it through any node in the cluster. Very simple.
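
If the database has a lot of tables you can script the conversion. Below is a small JDBC sketch that lists the tables and issues the ALTER for each one; the host, database name and credentials are placeholders for your own setup.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class ConvertToCluster
{
   public static void main(String[] args) throws Exception
   {
      // Connect to the database you want to move onto the cluster.
      Connection con = DriverManager.getConnection(
            "jdbc:mysql://192.168.1.6:3306/mydatabase", "root", "password");

      // Collect the table names first so we are not altering tables
      // while still iterating over the SHOW TABLES result set.
      List<String> tables = new ArrayList<String>();
      Statement st = con.createStatement();
      ResultSet rs = st.executeQuery("show tables");
      while (rs.next())
      {
         tables.add(rs.getString(1));
      }
      rs.close();

      for (String table : tables)
      {
         System.out.println("Converting " + table);
         st.executeUpdate("alter table " + table + " engine=ndbcluster");
      }
      st.close();
      con.close();
   }
}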
