I’ve used Hibernate in the past with EJB3 annotations. Frequently I needed to map query results that didn’t correspond to entity objects into value objects that could then be passed to the business or view tier. This is fairly trivial using the Hibernate Query API:


 public List<CustomerCityInfo> getCustomerCityInfo()
 {
     Query query = this.sessionFactory.getCurrentSession().createQuery("SELECT c.name as name, c.address.city as city from Customer c");
     query.setResultTransformer(Transformers.aliasToBean(CustomerCityInfo.class));
     return query.list();
 }

I ran into this need again when I started using JPA with the Hibernate JPA provider. I didn’t want to write non-portable code, so accessing the underlying Hibernate Session object was out.

My first naive attempt looked something like this:


 public List<CustomerCityInfo> getCustomerCityInfo()
 {
     Query query = entityManager.createQuery("SELECT c.name, c.address.city from Customer c");
     List<Object[]> resultList = query.getResultList();
     List<CustomerCityInfo> customerList = new ArrayList<CustomerCityInfo>();
     for (Object[] obj : resultList)
     {
         CustomerCityInfo customer = new CustomerCityInfo();
         customer.setName((String) obj[0]);
         customer.setCity((String) obj[1]);
         customerList.add(customer);
     }

     return customerList;
 }

Yuck!

I had some spare cycles recently, so I did some Google searching and discovered the “SELECT NEW” constructor expression that JPQL supports. It lets you replace the monstrosity above with something like this:


 public List<CustomerCityInfo> getCustomerCityInfo()
 {
     Query query = entityManager.createQuery("SELECT NEW com.mycompany.model.CustomerCityInfo(c.name, c.address.city) from Customer c");
     return query.getResultList();
 }

Much cleaner syntax. Much easier to maintain.
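One caveat: SELECT NEW requires the target class to be referenced by its fully qualified name and to have a constructor whose parameters match the selected columns in order. A minimal sketch of the value object, assuming both properties are Strings:

 public class CustomerCityInfo
 {
     private String name;
     private String city;

     public CustomerCityInfo() {}  // needed by the setter-based approaches above

     // Matched by the SELECT NEW constructor expression
     public CustomerCityInfo(String name, String city)
     {
         this.name = name;
         this.city = city;
     }

     public void setName(String name) { this.name = name; }
     public void setCity(String city) { this.city = city; }
     public String getName() { return name; }
     public String getCity() { return city; }
 }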

Cheers!

Have you ever found yourself doing something like this in some code you’ve been writing?


String xml = "<adocument><anelement>" + someproperty + "</anelement><anotherelement>" 
                 + anotherproperty + "</anotherelement></adocument>";

Yuck!

I’m a lazy developer. And I know from experience that at some point, I’m going to have to change this embedded XML-generating pile of crap. Someone will ask me to add an element, or remove an element, or add a namespace, or change the formatting, or… Ugh! Maintaining this kind of code is what makes maintenance programmers go insane.

Fortunately, I’ve spent enough time doing maintenance development to know that a little time spent up front saves a lot of time and frustration down the road, so I try to avoid naive solutions like the one above in favor of something that appeals to my laziness.

One simple, yet effective, solution to the above problem is to use Velocity. Velocity is a powerful template engine and makes short work of the simple example above. The main advantage is that the template is just text, and text is easy to maintain.

Here’s how we’d go about refactoring the above solution…

First, I’d create a Velocity template, adocument.vm, with the following content:

<?xml version="1.0" encoding="UTF-8"?>
<adocument>
  <anelement>${someproperty}</anelement>
  <anotherelement>${anotherproperty}</anotherelement>
</adocument>

Then, I’d configure the Velocity engine in my Spring context:

    <bean id="velocityEngine" class="org.springframework.ui.velocity.VelocityEngineFactoryBean">
      <property name="velocityProperties">
       <value>
        resource.loader=class
        class.resource.loader.class=org.apache.velocity.runtime.resource.loader.ClasspathResourceLoader
       </value>
      </property>
    </bean>

The properties tell Velocity how to find the template files when I give it the template file path. I’ve specified that I want Velocity to use the ClasspathResourceLoader to find my templates, which are likely packaged up inside my application archive (war, jar).

Now, I just wire the class that generates the XML up to the engine. In the Java class:

private VelocityEngine engine;

public void setVelocityEngine(VelocityEngine engine)
{
    this.engine = engine;
}

And in its Spring bean definition:

<property name="velocityEngine" ref="velocityEngine"/>

And tell it where the template is:

private String templateFile;

public void setTemplateFile(String file)
{
    this.templateFile = file;
}

With the corresponding bean property:

<property name="templateFile" value="myTemplatesDirectory/adocument.vm"/>

myTemplatesDirectory should be a directory in your classpath (packaged in your jar?) that contains your template.

Then, I change how I’m generating the XML:

Map<String, Object> model = new HashMap<String, Object>();
model.put("someproperty", someproperty);
model.put("anotherproperty", anotherproperty);
String xml = VelocityEngineUtils.mergeTemplateIntoString(velocityEngine, templateFile, model);

Note that this solution still violates the open/closed principle, but that can be solved by making the generation code a reusable module and having the client pass in the variables:

public String generateContent(Map<String, Object> model, String template)
{
    return VelocityEngineUtils.mergeTemplateIntoString(velocityEngine, template, model);
}
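A caller then supplies its own model and template path, and this class never has to change when a new document type comes along. For example (assuming generator is the bean exposing the method above; the template name is illustrative):

Map<String, Object> model = new HashMap<String, Object>();
model.put("someproperty", someproperty);
model.put("anotherproperty", anotherproperty);
String xml = generator.generateContent(model, "myTemplatesDirectory/adocument.vm");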

You may still need to update how the model is generated if the view needs to change (e.g. to add a new field), but you won’t have to muck around with any of the code just to make simple textual changes to the XML. That’s a solution that appeals to my laziness.

It should be obvious to some of you that this refactoring is essentially a simple MVC solution. Velocity can be, and frequently is, used in place of technologies like JSP to generate views. But, as shown above, it is also a powerful tool for generating content in your middleware, for instance the payload of a JMS message.

Cheers!

We have quite a few applications that use what amounts to a hand-rolled implementation of an ESB.

Typically, the system design includes several JMS queues which dispatch to a command chain for processing. The usage of the chain of responsibility pattern allows us to loosely couple components and processing logic.
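To give a flavor of the pattern, here is a minimal sketch of such a command chain. This is illustrative only, not our actual implementation; it loosely follows the Commons Chain convention of a command returning true to end processing:

import java.util.List;
import java.util.Map;

interface Command
{
    boolean execute(Map<String, Object> context) throws Exception;
}

public class ProcessingChain
{
    private final List<Command> commands;

    public ProcessingChain(List<Command> commands)
    {
        this.commands = commands;
    }

    public void process(Map<String, Object> context) throws Exception
    {
        for (Command command : commands)
        {
            // The first command that returns true ends the chain
            if (command.execute(context))
            {
                break;
            }
        }
    }
}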

In its simplest form, the system looks something like this:

Messages are received on the gateway and put on the associated JMS queue. The gateway can support a variety of pluggable protocols for inbound and outbound messaging (e.g. HTTP-REST, HTTP-WS, FTP, and even custom socket/UDP-based protocols); these gateway plugins are comparable to JBI Binding Components. Messages received by the inbound queue listener are dispatched to a processing chain, which typically performs a variety of processing steps (usually including persisting some data to the database). At the end of the processing chain, a message is usually put on the reciprocal JMS queue, constituting the reply to the original message.

The major difference between our application design and a canonical ESB is that our messages are persisted in the database while they are in flight, and only a unique identifier (e.g. the PK) for the stored message is passed through the system. An ESB, on the other hand, uses message-passing semantics in which the ENTIRE message is passed along.

Why is this distinction important?

Because the entire message is continually passed along between components in an ESB (via the Normalized Message Router), the receiver (e.g. a JBI Service Engine) always has an up-to-date ‘view’ of the message. This is not always the case in our system, and that can cause problems if we aren’t prepared for it.

We use an XA transaction to manage the JMS receive, any affiliated database updates, and the terminating send of the message at the end of the processing chain. It’s easy, however, to confuse WHAT the XA semantics guarantee with HOW they operate in practice.

Consider the case in which we store a message in the database and send the PK for the newly inserted record to another JMS queue for subsequent processing. When the transaction commits, the transaction manager will eventually invoke commit() on each participant in turn. What is important to remember, however, is that the specification makes no guarantees about WHEN each participant will be committed, only that once the transaction completes, the state will eventually be consistent to an outside observer.

What if, immediately after the transaction manager invokes commit() on the JMS resource, the thread of execution is suspended? The receiver listening on that queue may very well receive the message and start processing it before the transaction manager ever invokes commit() on the database resource. Under the typical Read Committed isolation level, the record will not yet be visible to the JMS receiver, because it has not been committed by either the transaction manager or the underlying database resource. Treating the absence of the record indicated by the PK in the message as an unrecoverable error would therefore be a programming error. You should expect this to occur and allow the message delivery to be rolled back and retried later, once the database resource has caught up with the rest of the system.
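To make that concrete, here is a minimal sketch of a defensive receiver. RecordDao, Record, and findByPk are illustrative names, not part of any framework; the point is that, in a transacted session, throwing from the listener rolls the delivery back so the message will be redelivered:

import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

public class RecordMessageListener implements MessageListener
{
    private RecordDao recordDao;  // hypothetical DAO for the persisted message

    public void onMessage(Message message)
    {
        try
        {
            long pk = Long.parseLong(((TextMessage) message).getText());
            Record record = recordDao.findByPk(pk);
            if (record == null)
            {
                // The database participant may not have committed yet; force a
                // rollback so the message is redelivered and retried later.
                throw new IllegalStateException("Record " + pk + " not visible yet");
            }
            // ... process the record ...
        }
        catch (JMSException e)
        {
            throw new RuntimeException(e);
        }
    }
}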

It is easy to be lulled into complacency in this scenario, because IN PRACTICE, you will rarely ever see the dreaded “record not found” error. This is because, in most situations, the commit happens so quickly on all the resources that the database IS in sync before the JMS listener receives the message. Furthermore, if you are using a non-XA database resource and relying on your transaction manager to emulate XA by using some variant of the Last Resource Gambit, this will almost always mean that the transaction manager commits the database resource BEFORE the JMS resource. However, relying on this to be the case is dangerous because we’ve come across at least one transaction manager (Atomikos) for which this didn’t hold true.

In summary, remember that XA transactions guarantee you WHAT the state of the system will be but make no guarantees on WHEN that will occur. Failure to observe this distinction may cause you sleepless nights when your software is in production.

I’m currently working on an application that allows an administrative user to upload an input file that contains thousands of records that need to be processed.

Our initial naive solution to this problem was to parse the upload file into individual records, save each record to the database, and then send a JMS message containing the record’s PK to a JMS queue to kick off asynchronous processing.

It worked great… until we tried to process a file with more than a few thousand records. Then the transaction that the file processing was running in started timing out. That’s a problem.

You see, our benchmark for file processing is 10k records; that’s the expected size of the files we will start receiving once our system goes live. We’d also like some additional breathing room in processing capability in case we see larger files in the wild. But, as it was, not only did we not have any breathing room, we had a giant boulder sitting on our chest, crushing us.

What to do?

It was about this time that I reflected on the need for an entirely new approach, not a variation on our existing processing strategy. Spring Batch to the rescue.

The Spring Batch framework is a new addition to the Spring portfolio, and its primary committers have extensive experience building batch processing solutions. It is designed as a Java-based framework employing traditional batch processing strategies. In other words, it wasn’t a natural fit for my needs: kicking off batch processing with dynamic input from within a web application.

Many (most) of the collaborators in a Spring Batch configuration are stateful, so they don’t lend themselves to traditional singleton-based Spring context wiring. I originally tried defining these beans as prototypes, but one of the stateful collaborators needs to be injected TWICE into one of the components (once as a collaborator and once to register as a listener). It couldn’t be defined as a prototype, or two different objects would be injected, and that wouldn’t work.

In the documentation and on the forums, the Spring Batch team has suggested that you just use a new ApplicationContext for each run of your batch job. All of the examples I’ve found construct the ENTIRE application context for each run of the batch job(s). That won’t work in my case, because my application context is being used by the entire web application. I can’t refresh it (and the hundreds of beans it contains) just to breathe new life into a handful of stateful Spring Batch beans.

Enter the ClassPathXmlApplicationContextJobFactory. This class allows you to construct your batch job beans from an existing parent ApplicationContext and a subcontext bean definition file. The key is that every time you request a job from the factory, it constructs a brand new subcontext and wires your batch job beans up to the rest of its singleton collaborators in your parent application context. Yippee!

The only gripe I had about this JobFactory implementation was that it expected its properties to be injected as constructor arguments (including the parent application context), and I wasn’t aware of a way to pass the application context into a bean unless the bean implemented ApplicationContextAware. Hence, I created the following wrapper class that supports property injection, making configuration a snap:

public class ContextAwareJobFactory implements JobFactory, ApplicationContextAware, InitializingBean
{
    private ClassPathXmlApplicationContextJobFactory delegate;
    
    /* The parent application context */
    private ApplicationContext applicationContext;
    /* The job bean name */
    private String beanName;
    /* resource path to subcontext spring config */
    private String subcontextPath;

    public void setBeanName(String beanName)
    {
        this.beanName = beanName;
    }

    public void setSubcontextPath(String subcontextPath)
    {
        this.subcontextPath = subcontextPath;
    }

    /* (non-Javadoc)
     * @see org.springframework.batch.core.configuration.JobFactory#createJob()
     */
    @Override
    public Job createJob()
    {
        return delegate.createJob();
    }

    /* (non-Javadoc)
     * @see org.springframework.batch.core.configuration.JobFactory#getJobName()
     */
    @Override
    public String getJobName()
    {
        return delegate.getJobName();
    }

    /* (non-Javadoc)
     * @see org.springframework.context.ApplicationContextAware#setApplicationContext(org.springframework.context.ApplicationContext)
     */
    @Override
    public void setApplicationContext(ApplicationContext context) throws BeansException
    {
        this.applicationContext = context;
    }

    /* (non-Javadoc)
     * @see org.springframework.beans.factory.InitializingBean#afterPropertiesSet()
     */
    @Override
    public void afterPropertiesSet() throws Exception
    {
        delegate = new ClassPathXmlApplicationContextJobFactory(this.beanName, this.subcontextPath, this.applicationContext);
    }
}

Configuration is then very simple:

    <bean id="jobFactory" class="com.mybatch.ContextAwareJobFactory">
      <property name="beanName" value="myJob"/>
      <property name="subcontextPath" value="classpath:spring/mybatch-processing-prototype-beans.xml"/>
    </bean>

Note that all of the stateless beans that my batch process uses (FieldSetMappers, LineTokenizers, etc.) are defined in the parent context. Only beans that are stateful, or that are injected with stateful beans, are defined in the subcontext “prototype” beans file.
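With the factory in place, kicking off a run from the web tier is straightforward. A rough sketch, assuming a JobLauncher bean is wired in and the uploaded file has already been saved to disk (parameter names are illustrative; exception handling omitted):

Job job = jobFactory.createJob();

// Unique parameters make each upload a distinct job instance
JobParameters params = new JobParametersBuilder()
    .addString("inputFile", uploadedFilePath)
    .addLong("runId", System.currentTimeMillis())
    .toJobParameters();

jobLauncher.run(job, params);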

And that’s it! Presto, I’m able to process input files with thousands or tens of thousands of records with no problem. Spring Batch also supports chunking and restart, so if your batch job gets interrupted, it can be restarted and pick up where it left off.

If you are Spring addicted and find yourself in need of a batch processing solution, I’d suggest that you give Spring Batch a long look.