JAW Speak

Jonathan Andrew Wolter

Archive for the ‘java’ Category

Spring Slow Autowiring by Type getBeanNamesForType fix 10x Speed Boost >3600ms to <100ms

with 8 comments

Reading time: < 1 minute

We’re using Spring MVC, configured mostly by annotations, a custom scope for FactoryBeans so they don’t get created once per request, and autowiring by type. This makes for simple code, where the configuration lives right alongside what it configures. However, after a certain number of autowired beans, performance was abysmal.

Loading the main front page in Firefox, with firebug showing slowness:
baseline-firebug

Click on to read the full post.

Written by Jonathan

November 28th, 2010 at 4:25 am

Posted in code, java


Spring Bean Autowiring into Handler/Controller Method Arguments

with 2 comments

Reading time: 2 – 4 minutes

We request-scope controllers in our Spring MVC web application, and Spring injects their collaborators. Each request builds up an object graph of mostly request-scoped components; they service the request and are then garbage collected. Examples of collaborators are Services, request-scoped objects like User, or objects from session. Declare these injectables as dependencies of your controller, and the IoC framework will resolve the instances for you. Using Spring’s FactoryBean, you can wrap custom logic around retrieving from a DB or service call. (Also, if you request-scope everything, you will probably want to make your FactoryBeans caching, or create a custom scope, so Spring doesn’t recreate each FactoryBean, possibly making a remote call, on every object injection.)

Instantiate a graph of request-scoped objects per request. Is that crazy? Not really, because garbage collection and object instantiation are very fast on modern JVMs. I argue that it helps you write cleaner and more maintainable code. Designs have better object orientation, tests don’t require as many mocks, and it’s easier to obey the single responsibility principle.

None of this is new, though. Here’s where it gets interesting for us. We have @RequestMapping methods on a controller, but only one of them needs the injected collaborator. If it is slow to retrieve the collaborator (such as a CustomerPreferences we get from a remote call), we don’t want to call it every time. Sometimes this means you need two controllers, other times you want to let any spring bean be injected into a handler method.

We extended Spring to inject any Spring bean into a controller/handler’s @RequestMapping or @ModelAttribute annotated methods. The benefit: the bean is injected only into the handler method that needs it, avoiding remote calls to look up that dependency, which would occur on every request if it were a field on a controller with multiple @RequestMapping methods.

Here’s a sample controller’s handler method. Before:

	@RequestMapping(value = Paths.ACCOUNT, method = GET)
	public String showAccount(Map model) {
	    model.put("prefs", customerService.getCustomerPrefs(session.getCustomerId()));
	    return "account";
	}

There is an extension point, WebArgumentResolver, for you to add your own logic to resolve method parameters (such as autowiring them as Spring-resolvable beans). After:

	@RequestMapping(value = Paths.ACCOUNT, method = GET)
	public String showAccount(Map model,
	        @AutowiredHandler CustomerPreferences customerPreferences) {
	    // the CustomerPreferences has a CustomerPreferencesFactoryBean which
	    // knows how to make a remote service call and retrieve the prefs.
	    model.put("prefs", customerPreferences); // easier to test
	    return "account";
	}

This lets you inject any spring bean into fields/methods annotated with @ModelAttribute or @RequestMapping.
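Conceptually, the resolver’s job is simple: scan the handler method’s parameters and supply a bean for each one marked with the annotation. Here is a Spring-free sketch of that idea; the HashMap stands in for the Spring ApplicationContext, and String stands in for CustomerPreferences, so this compiles and runs without a container. The real implementation plugs into Spring’s argument-resolution extension point instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.HashMap;
import java.util.Map;

// Spring-free sketch: resolve a handler method's arguments by looking up
// each @AutowiredHandler-marked parameter in a bean registry, by type.
class HandlerArgumentResolverDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface AutowiredHandler { }

    // Stand-in for the ApplicationContext.
    static final Map<Class<?>, Object> beans = new HashMap<>();

    // Build the argument array for a handler method, "autowiring" marked params.
    static Object[] resolveArguments(Method handler) {
        Parameter[] params = handler.getParameters();
        Object[] args = new Object[params.length];
        for (int i = 0; i < params.length; i++) {
            if (params[i].isAnnotationPresent(AutowiredHandler.class)) {
                args[i] = beans.get(params[i].getType()); // lookup by type
            }
        }
        return args;
    }

    // A handler method like showAccount above; String stands in for
    // CustomerPreferences to keep the sketch self-contained.
    public String showAccount(@AutowiredHandler String customerPreferences) {
        return "prefs=" + customerPreferences;
    }

    public static void main(String[] args) throws Exception {
        beans.put(String.class, "emailOnly");
        Method m = HandlerArgumentResolverDemo.class
                .getMethod("showAccount", String.class);
        Object result = m.invoke(new HandlerArgumentResolverDemo(), resolveArguments(m));
        System.out.println(result); // prints prefs=emailOnly
    }
}
```

The point of the design: the bean lookup happens only when this particular handler method is invoked, not when the controller is constructed.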

Another very helpful use of this is to automatically inject User or other request-specific objects.

References for further discussion/examples on the topic:
http://karthikg.wordpress.com/2009/10/12/athandlerinterceptor-for-spring-mvc/
http://karthikg.wordpress.com/2010/02/03/taking-spring-mvc-controller-method-injection-a-step-further/

Written by Jonathan

November 20th, 2010 at 4:13 am

Posted in code, java


Session State – it’s not complicated, but options are limited

with 2 comments

Reading time: 3 – 4 minutes

Web apps have numerous choices for storing stateful data in between requests. For example, when a user goes through a multi-step wizard, he or she might make choices on page one that need to be propagated to page three. That data could be stored in http session. (It also could get stored into some backend persistence and skip session altogether).

So where does session get stored? There are basically four choices here.

  1. Store state on one app server, i.e. for java in HttpSession. Subsequent requests need to be pinned to that particular app server via your load balancer. If you hit another app server, you will not have that data in session.
  2. Store state in session, and then replicate that session to all or many other app servers. This means you don’t need a load balancer to anchor a user to one app server: requests can hit any app server (full replication), or the load balancer can pin users to a particular cluster (whose members replicate sessions among themselves, giving higher availability). Replication can be smart, so that only the deltas of binary data are multicast to the other servers.
  3. Store the state on the client side, via cookies, hidden form fields, query strings, or client side storage in flash or html 5. Rails has an option to store it in cookies automatically. Consider encrypting the session data. However, some of these options can involve a lot of data going back and forth on every request, especially if it’s in a cookie and images/scripts are served from the same domain.
  4. Store no state on app servers, instead write everything in between requests down to backend persistence. Do not necessarily use the concept of http session. Use id’s to look up those entities. Persistence could be a relational database, distributed/replicated key/value storage, etc. Your session data is serialized in one big object graph, or as multiple specific entries.
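As a concrete illustration of option 3, here is a hedged sketch of round-tripping session state through an encrypted cookie value. The class and method names are illustrative; a production setup should use an authenticated cipher mode such as AES-GCM with a random IV, rather than the bare "AES" transformation used here for brevity.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of client-side session state: serialize the state, encrypt it with
// a server-side key, and ship it to the browser as a cookie value. The
// client carries the state; only the key stays on the server.
class CookieSessionDemo {
    static String encryptCookie(SecretKey key, String sessionState) throws Exception {
        Cipher enc = Cipher.getInstance("AES");
        enc.init(Cipher.ENCRYPT_MODE, key);
        return Base64.getEncoder()
                .encodeToString(enc.doFinal(sessionState.getBytes(StandardCharsets.UTF_8)));
    }

    static String decryptCookie(SecretKey key, String cookieValue) throws Exception {
        Cipher dec = Cipher.getInstance("AES");
        dec.init(Cipher.DECRYPT_MODE, key);
        return new String(dec.doFinal(Base64.getDecoder().decode(cookieValue)),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();          // stays on the server

        String state = "wizardStep=3&color=blue";  // choices from page one of a wizard
        String cookie = encryptCookie(key, state); // travels to the client and back

        System.out.println(decryptCookie(key, cookie)); // prints wizardStep=3&color=blue
    }
}
```

Note the trade-off the post mentions: this cookie rides along on every request, so keep the state small.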

What you choose depends on many factors, but a few guidelines help:

  1. Try to keep what is in session small.
  2. If possible, keep session state on the client side.
  3. Prefer key/value storage replicated among a small cluster over replicating all session state among all app servers.
  4. Genuinely consider sticky sessions, which home users to a particular app server. There are many benefits, including what Paul Hammant talks about w.r.t. Servlet spec 7.7.2 in Appengine’s Blind Spot.
  5. If you serialize an object graph, recognize that when you do a deployment it will probably mean existing sessions are now unable to be deserialized by the newly deployed app. Avoid this by using your load balancer to swing new traffic into the new deployments, monitor errors, and then let the old sessions expire before switching all users over to the new deployment. Bleed them over.
  6. Session is not for caching. It may be tempting to store data in session for caching purposes, but what you actually need then is a cache.
  7. Store in session what is absolutely necessary for that user, but not more. See caches, above.
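Guideline 1 can be sketched as follows: session holds only a small id, while the entity itself lives in backend persistence. The class below is a hypothetical stand-in; the in-memory map plays the role of a database or replicated key/value store that any app server could reach.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Keep session small: persist the entity, put only its id in HttpSession,
// and rehydrate from persistence on later requests (possibly on another
// app server sharing the same store).
class SmallSessionDemo {
    private final Map<Long, String> cartStore = new ConcurrentHashMap<>(); // stand-in for DB/KV store
    private final AtomicLong ids = new AtomicLong();

    // Persist the cart; the returned Long is all that goes into session.
    Long saveCart(String cartContents) {
        long id = ids.incrementAndGet();
        cartStore.put(id, cartContents);
        return id;
    }

    // On a later request, the id from session is enough to load the cart.
    String loadCart(Long idFromSession) {
        return cartStore.get(idFromSession);
    }

    public static void main(String[] args) {
        SmallSessionDemo demo = new SmallSessionDemo();
        Long sessionValue = demo.saveCart("sku-42,sku-99");
        System.out.println(demo.loadCart(sessionValue)); // prints sku-42,sku-99
    }
}
```

This also sidesteps the serialized-object-graph deployment problem from guideline 5, since only a Long lives in session across deployments.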

Update: see also http://www.ibm.com/developerworks/websphere/library/bestpractices/httpsession_performance_serialization.html

Written by Jonathan

July 30th, 2010 at 3:39 pm

Posted in architecture, code, java


Spring Request Scoped FactoryBean returning null gets cached forever

without comments

Reading time: 2 – 4 minutes

We are using FactoryBeans extensively to own the responsibility of interesting work in constructing our object graph. These are request scoped, because they may depend on other objects that were themselves created by FactoryBeans specific to this request. For example, we have a Customer which needs an Account, so CustomerFactoryBean depends on an Account field being injected. (We use field or setter injection because otherwise a factory bean that depends on another object created by a factory bean may trigger a Spring exception about circular dependencies.)

package com.example.dotcom;

import com.example.common.Account;
import com.example.dotcom.session.CustomSession;
import org.springframework.beans.factory.FactoryBean;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Scope;

// the scope is crucial; this controls the scope of the factory bean
@Scope("request")
public class AccountFactoryBean implements FactoryBean {

    // this is itself created from another factory bean
    @Autowired
    CustomSession session;

    @Override
    public Object getObject() throws Exception {
        // Sometimes this is going to return null, other times an instance.
        // The fix is to always return a non-null instance (despite what the
        // javadoc says). Use the Null Object pattern.
        return session.get(Account.class);

        // Note: you will need to implement caching in here or in
        // a custom scope (which we did). Otherwise, every time you need
        // to inject the Account it will call getObject(), which
        // could be problematic if it were a slow operation.
    }

    @Override
    public Class getObjectType() {
        return Account.class;
    }

    @Override
    public boolean isSingleton() {
        return false;
    }
}

We are autowiring by type, but we encountered a problem when the result of a factory bean’s getObject() is sometimes null. If the first call to that request scoped factory bean returned null, Spring would never call getObject() again on future requests. Upon investigation, you can see there is a subtle caching of null return values from factory beans (even if they are request scoped). It might be a bug, I don’t know. But the workaround is probably a good idea to implement anyway: use Null Objects instead of passing around null.

Here is the problematic code from Spring that caches the null return value: AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement. There are fields cached and cachedFieldValue. These are set even when the value is null, and then on subsequent calls they return the cached null without calling getObject() on the factory bean.

I recommend not returning null in factory beans, or else it might be cached by Spring and your factory bean will not be called again when that object is needed.
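Here is a minimal sketch of that recommendation using the Null Object pattern. Account, RealAccount, and NullAccount are illustrative names, not our production classes; the fromSession method mimics what the factory bean’s getObject() should do with the possibly-null session lookup.

```java
// Null Object pattern: callers (and Spring's cache) never see null.
interface Account {
    boolean isPresent();
    String id();
}

class RealAccount implements Account {
    private final String id;
    RealAccount(String id) { this.id = id; }
    public boolean isPresent() { return true; }
    public String id() { return id; }
}

// One shared instance stands in for "no account in session yet".
class NullAccount implements Account {
    static final Account INSTANCE = new NullAccount();
    public boolean isPresent() { return false; }
    public String id() { return ""; }
}

class NullObjectDemo {
    // Mimics getObject(): wrap the possibly-null session lookup.
    static Account fromSession(Account storedOrNull) {
        return storedOrNull != null ? storedOrNull : NullAccount.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(fromSession(null).isPresent());           // prints false
        System.out.println(fromSession(new RealAccount("a1")).id()); // prints a1
    }
}
```

Because getObject() now always returns a real instance, Spring’s null-caching behavior is never triggered, and callers can test isPresent() instead of comparing against null.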

Written by Jonathan

July 27th, 2010 at 11:18 pm

Posted in code, java


Hotspot caused exceptions to lose their stack traces in production – and the fix

with 2 comments

Reading time: 3 – 4 minutes

Today I was working with Kumar and Elian when we encountered production logs with dozens of NullPointerExceptions with no stack trace. We identified that the JIT compiler will optimize away stack traces in certain exceptions if they are thrown often enough. The code below reproduces it. We are on JDK 1.6 (I don’t remember the minor version now).

public class NpeThief {
    public void callManyNPEInLoop() {
        for (int i = 0; i < 100000; i++) {
            try {
                ((Object)null).getClass();
            } catch (Exception e) {
                // This will switch from 2 to 0 (indicating our problem is happening)
                System.out.println(e.getStackTrace().length);
            }
        }
    }
    public static void main(String ... args) {
        NpeThief thief = new NpeThief();
        thief.callManyNPEInLoop();
    }
}

Run it as follows and the issue will appear (the stack trace length changes from 2 to 0):
javac NpeThief.java && java -classpath . NpeThief
javac NpeThief.java && java -server -classpath . NpeThief

How to fix it? The following options resolve it and it stays at 2, never going to 0 as length of NPE’s stack trace:
javac NpeThief.java && java -XX:-OmitStackTraceInFastThrow -server -classpath . NpeThief
javac NpeThief.java && java -XX:-OmitStackTraceInFastThrow -classpath . NpeThief
javac NpeThief.java && java -Xint -classpath . NpeThief
javac NpeThief.java && java -Xint -server -classpath . NpeThief

So the solution is to start with -XX:-OmitStackTraceInFastThrow argument to java which instructs the JIT to remove this optimization [1,2], or operate in interpreted only mode [3]. What is going on? [2]

The JVM flag -XX:-OmitStackTraceInFastThrow disables the performance optimization of the JVM for this use case. If this flag is set, the stacktrace will be available. (editor: or if the -Xint is set).

“The compiler in the server VM now provides correct stack backtraces for all “cold” built-in exceptions. For performance purposes, when such an exception is thrown a few times, the method may be recompiled. After recompilation, the compiler may choose a faster tactic using preallocated exceptions that do not provide a stack trace. To disable completely the use of preallocated exceptions, use this new flag: -XX:-OmitStackTraceInFastThrow.” http://java.sun.com/j2se/1.5.0/relnotes.html

How does this impact performance? I did extremely naive timing with time java [args] -classpath . NpeThief. It behaved as expected, with interpreted mode the slowest. Is that the solution?

No. We weren’t going to change production JVM options to resolve this. Instead, since our isolated example above showed that the initial exceptions are thrown with full stack traces, we went back into older logs and grepped. Sure enough, right after we deployed there were exceptions with full stack traces. That gave us enough information to identify and fix the bug.

Notes:
[1] Related sun bug 4292742 NullPointerException with no stack trace
[2] Helpful StackOverflow discussion of NullPointerException missing stack traces
[3] -Xint: Operate in interpreted-only mode. Compilation to native code is disabled, and all bytecodes are executed by the interpreter. The performance benefits offered by the Java HotSpot VMs’ adaptive compiler will not be present in this mode.

Written by Jonathan

May 26th, 2010 at 12:46 am

Posted in code, java, thoughtworks

12 Tips for Less* Hate of Maven

with 4 comments

Reading time: 3 – 5 minutes

We’re (stuck*) using maven on a project, which means the build often doesn’t act the way you expect. Debugging requires knowledge of maven’s capabilities and less-documented features. Here are some tips; please add more in the comments.

  1. The -X flag shows debug logging; tee it to a file which you then search for clues about what happened. This is great for verifying whether a plugin actually ran, and what it did. Example: mvn -X install | tee out.txt
  2. mvn dependency:tree (show the transitive dependencies of your dependencies) [1]
  3. -Dtest=MyJavaTest (just use the name of the class, no package necessary)
  4. mvn help:system (show environmental variables and system properties) [2]
  5. mvn help:active-profiles (show the active profiles to be run)
  6. mvn help:all-profiles (show all profiles maven reads)
  7. mvn help:effective-pom (show the logical pom.xml being used)
  8. mvn help:effective-settings (show the logical settings.xml which is being used)
  9. mvn help:describe -DgroupId=org.apache.maven.plugins -DartifactId=maven-resources-plugin -Ddetail (show details about a plugin’s options)
  10. -DskipTests skips running tests but still compiles them. Or -Dmaven.test.skip=true, which skips both compilation and execution of tests (use with judgment; I believe in testing)
  11. If -X doesn’t show you the properties when you need them, you can also use the ant-run plugin to echo a property value (does anyone else know a better way in maven to see what a property’s value is at the time a plugin or goal executes?)
  12. Run your tests paused in debug mode, waiting for a debugger to attach to port 5005 with mvn -Dmaven.surefire.debug test [4]

References:

*Note: Initially I disliked the confusion and the long, slow learning curve of maven. After working 8 months with maven and my colleagues Lucas Ward and Paul Hammant, I’m feeling much more comfortable and am generally okay with it. For small open source work, where you want a project up quickly but don’t want to spend a few hours with ant, I like it very much. For our large project, maven is fundamentally broken: too much file copying is required, and manual intervention on a complex build is impossible, or extremely tedious. A build script we control ourselves could be dramatically more flexible.

On a mailing list Paul recently recommended a few others to consider:

Maven Dependency plugin
mvn dependency:tree
- shows me the transitive deps (say, “how the hell is MX4J being dragged into the build?”)
mvn dependency:analyze
- show me unused, but declared dependencies.
There’s a ton of free capability to that plugin – http://maven.apache.org/plugins/maven-dependency-plugin/

An Atlassian dependency plugin
Atlassian wrote a plugin ( http://confluence.atlassian.com/display/DEV/FedEx+XI+-+Keeping+track+of+you+maven+dependencies ) that guards against undesired changing dependencies.

Maven Enforcer Plugin
A more intrusive way of guarding against deps that, say, should not make it into a war.  http://maven.apache.org/plugins/maven-enforcer-plugin/

What tips do you have? Please share them in the comments.

Written by Jonathan

May 14th, 2010 at 8:13 am

Posted in code, java, thoughtworks


Injecting HttpServletResponse into Spring MVC request scoped interceptors

without comments

Reading time: 3 – 5 minutes

I’ve been heavily refactoring a large team’s codebase that stuffed many critical dependencies into request attributes. You see request.getAttribute() throughout the code, which runs counter to dependency injection’s principles: instead of reaching out to get what you want, you should be given it. We are moving to a more idiomatic way of using Spring to inject things such as our custom Session object.

However, there was a lifecycle mismatch: some singleton interceptors needed the session object, which is only request scoped. We had everything working, except that one of those interceptors also needed the response object as a collaborator.

I had a Spring-instantiated, non-controller bean that is request scoped, and I wanted to inject two collaborators:

HttpServletRequest (which can be injected)
HttpServletResponse (which fails injection)

Here is my bean:
<bean id="sessionFactory" scope="request">
<aop:scoped-proxy/> <!-- scoped-proxy to get around the problem of injecting a request scoped object into a singleton -->
</bean>

Here are the autowired fields:
@Autowired HttpServletRequest request;
@Autowired HttpServletResponse response;

Can I inject the response object in this bean?

This is the error I was getting:
[INFO] Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'scopedTarget.sessionFactory': Autowiring of fields failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: javax.servlet.http.HttpServletResponse com.example.web.SessionRequiring$SessionFactory.response; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No unique bean of type [javax.servlet.http.HttpServletResponse] is defined: Unsatisfied dependency of type [interface javax.servlet.http.HttpServletResponse]: expected at least 1 matching bean
[INFO] at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessAfterInstantiation(AutowiredAnnotationBeanPostProcessor.java:243)

I navigated to GenericWebApplicationContext, which delegates to WebApplicationContextUtils.registerWebApplicationScopes(beanFactory), which then registers ServletRequest.class and HttpSession.class as injectable. But not the response object.

So here’s what I did, which works for me:

In web.xml, add a filter that stores every ServletResponse in a ThreadLocal, accessible only from your FactoryBean.

  <filter>
        <filter-name>responseInScopeFilter</filter-name>
        <filter-class>org.springframework.web.filter.DelegatingFilterProxy</filter-class>
        <init-param>
            <param-name>targetBeanName</param-name>
            <param-value>responseInScopeFilter</param-value>
        </init-param>
  </filter>
  <filter-mapping>
        <filter-name>responseInScopeFilter</filter-name>
        <url-pattern>/*</url-pattern>
  </filter-mapping>

Create the filter:

    /**
     * We need a filter to capture the HttpServletResponse object such that it can be injectable elsewhere in request scope.
     */
   public static class ResponseInScopeFilter implements Filter {
        // not the most elegant, but our Spring committer friends suggested this way.
        private ThreadLocal<HttpServletResponse> responses = new ThreadLocal<HttpServletResponse>();
        public void init(FilterConfig filterConfig) throws ServletException { }

        public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain) throws IOException, ServletException {
            HttpServletResponse response = (HttpServletResponse) servletResponse;
            responses.set(response);
            try {
                chain.doFilter(servletRequest, servletResponse);
            } finally {
                responses.remove(); // always clear, even on exception, so a pooled thread does not leak the response
            }
        }

        /** Only to be used by the BeanFactory */
        private HttpServletResponse getHttpServletResponse() {
            return responses.get();
        }

        public void destroy() { }
    }

And also create the FactoryBean:

    /** We need our own FactoryBean because Spring does not offer a way to
     * inject request scoped HttpServletResponse objects.
     * See http://forum.springsource.org/showthread.php?p=298179#post298179 */
    public static class HttpServletResponseFactoryBean implements FactoryBean {
        @Autowired ResponseInScopeFilter responseInScopeFilter;

        public Object getObject() throws Exception {
            return responseInScopeFilter.getHttpServletResponse();
        }

        public Class getObjectType() {
            return HttpServletResponse.class;
        }

        public boolean isSingleton() {
            return false;
        }
    }

Wire it all together in applicationContext.xml

<bean id="httpServletResponse" class="com.example.web.SessionRequiring$HttpServletResponseFactoryBean" scope="request"/>
<bean id="responseInScopeFilter" class="com.example.web.SessionRequiring$ResponseInScopeFilter"/>

Elsewhere I can now inject an HttpServletResponse object, scoped appropriately to the current request. You can read more about this in my post on the spring forums.
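For readers unfamiliar with ThreadLocal, here is a minimal, container-free sketch of the isolation the filter relies on: each request-handling thread sees only the value it set itself, so concurrent requests cannot observe each other’s response objects. String stands in for HttpServletResponse so the sketch runs without a servlet container.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Each thread gets its own slot in the ThreadLocal: set/get/remove on one
// thread never touches another thread's value, which is exactly what the
// filter needs when two requests are in flight at once.
class ThreadLocalIsolationDemo {
    static final ThreadLocal<String> responses = new ThreadLocal<>();

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Runnable request = () -> {
            String name = Thread.currentThread().getName();
            responses.set("response-for-" + name); // the filter's responses.set(...)
            seen.put(name, responses.get());       // the factory bean's responses.get()
            responses.remove();                    // the filter's cleanup
        };
        Thread a = new Thread(request, "req-A");
        Thread b = new Thread(request, "req-B");
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(seen.get("req-A")); // prints response-for-req-A
        System.out.println(seen.get("req-B")); // prints response-for-req-B
    }
}
```

This also shows why the remove() in the filter matters: app servers pool threads, so a value left behind would be visible to whichever request that thread serves next.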

Written by Jonathan

May 6th, 2010 at 9:32 pm

Posted in code, java, thoughtworks

Simplicity is Better for Deploying in Production Web Architectures

without comments

Reading time: 5 – 8 minutes

Engineering something to be scalable, highly available, and easily manageable has been the focus of much of my time recently. Last time I talked about spiderweb architecture, because it has attributes of scalability and high availability, yet it comes with a hidden cost: complexity.

Here is a fictional set of questions, and my responses for the application architecture.

Q: Why does complexity matter?
JAW: Because when your system is complex, there is less certainty. Logical branches in the possible states of the system mean more work for engineers to build a mental model and decide what action to take. Complexity means there are more unique points of failure.

Q: But my team is really, really smart; my engineers can handle clever and complex mental models!
JAW: That wasn’t a question, but I do have a response. At any given moment, a team can handle a finite amount of complexity. That complexity can live in the application’s logic, delivering business value, or in non-functional requirements. If the NFRs can be met with lower complexity, the savings translate directly into more business value. A team will grow in its ability to manage complexity as it understands more and more of it, and team size can increase, but those productivity gains can be spent either on business value or on complex architectures. And often, NFRs can be met while still achieving simplicity.

Q: So how do I deal with a large, complex application which needs an emergency fix on one of the small components?
JAW: Yes, I know the scenario. You want to make a small change in production, and it sounds less risky to push only one part. Here’s my recipe for success: make every deployment identical, and automated. (Ideally, push into production from continuous builds, with automated testing.) In the event of an emergency push into production, alter the code from your version control tag, and deploy it as you would every other push. My colleague Paul Hammant calls non-standard, risky pushes “white knuckle three-in-the-morning deployments.”

Don’t make the e-fix a one-off, non-standard production push. Keep the entire system simple and repeatable. With repeatability and automated repetition comes security. Very flexible (read: complex), extensible (read: rarely exercised day to day) hooks can be built into a system to make it possible to push just one small component into production. In reality, though, unused code becomes stale, and when a production emergency happens people will be too scared to try those hooks. Or if they do, there is a greater risk of a misconfiguration and failure, which then necessitates a fix of the failed fix that tried to fix the original tiny defect. More complexity. Blowing the original availability requirements out of the water.

Q: So, what is simplicity?
JAW: My definition says: Simplicity is the preference of fewer combinatorial states a system can be in. Choose defaults over

I recently read a quote from High Scalability, which I think gives a good definition of what simplicity is (emphasis added):

“Keep it simple! Simplicity allows you to rearchitect more quickly so you can respond to problems. It’s true that nobody really knows what simplicity is, but if you aren’t afraid to make changes then that’s a good sign simplicity is happening.”

[Caveat: some complexity makes sense; it’s just that too much in the wrong places increases risk. And there is a threshold everyone needs to find: how much risk, how much flexibility, and how much energy to devote to reducing the risk while keeping high flexibility.]

Update: Thanks to Lucas for pointing me to an interesting article about Second Life scaling:

A precondition of modern manufacturing, the concept of interchangeable parts that can help simplify the lower layers of an application stack, isn’t always embraced as a virtue. A common behavior of small teams on a tight budget is to tightly fit the building blocks of their system to the task at hand. It’s not uncommon to use different hardware configurations for the webservers, load balancers (more bandwidth), batch jobs (more memory), databases (more of everything), development machines (cheaper hardware), and so on. If more batch machines are suddenly needed, they’ll probably have to be purchased new, which takes time. Keeping lots of extra hardware on site for a large number of machine configurations becomes very expensive very quickly. This is fine for a small system with fixed needs, but the needs of a growing system will change unpredictably. When a system is changing, the more heavily interchangeable the parts are, the more quickly the team can respond to failures or new demands.

In the hardware example above, if the configurations had been standardized into two types (say Small and Large), then it would be possible to muster spare hardware and re-provision as demand evolved over time. This approach saves time and allows flexibility, and there are other advantages: standardized systems are easy to deploy in batches, because they do not need assigned roles ahead of time. They are easier to service and replace. Their strengths and weaknesses can be studied in detail.

All well and good for hardware, but in a hosted environment this sort of thing is abstracted away anyway, so it would seem to be a non-issue. Or is it? Again using the example above, replace “hardware” with “OS image” and many of the same issues arise: an environment where different components depend on different software stacks creates additional maintenance and deployment headaches and opportunities for error. The same could be said for programming languages, software libraries, network topologies, monitoring setups, and even access privileges.

The reason that interchangeable parts become a key scaling issue is that a complex, highly heterogeneous environment saps a team’s productivity (and/or a system’s reliability) to an ever-greater degree as the system grows. (Especially if the team is also growing, and new developers are introducing new favorite tools.) The problems start small, and grow quietly. Therefore, a great long-term investment is to take a step back and ask, “what parts can we standardize? Where are there differences between systems which we can eliminate? Are the specialized outliers truly justified?” A growth environment is a good opportunity to standardize on a few components for future expansion, and gradually deprecate the exceptions.

Written by Jonathan

January 30th, 2010 at 1:20 pm

Posted in architecture, java

Can you spot Java Puzzler in this snippet?

without comments

Reading time: < 1 minute

I ran across this last week. It was marvelous when we saw what was happening, but entirely puzzling at first.

Boolean someFlag = complicatedLogicToFigureOutFlag();
Person person = new Person(someFlag);

Any signs for concern? How about if Person’s constructor is:

Person(boolean someFlag) {
    this.someFlag = someFlag;
}

Any warning signs?

Will it compile?

Read more for the full puzzler.


Written by Jonathan

September 30th, 2009 at 2:51 pm

Posted in code, java, puzzle

Large Web App Architecture: Yes to Thicker Stack on One Hardware Node, No to Beautiful “Redundant” Spiderwebs

with 4 comments

Reading time: 4 – 7 minutes

The last client our team worked with had a large ecommerce operation; yearly revenue on the new site is in the high single-digit billions of dollars. This necessitates extremely high availability. I will draw an initially favorable-looking configuration for this high availability (“beautiful spiderwebs”), but then tear it apart and suggest an alternative (“thicker stack on one hardware node”).

1. “Beautiful Spiderwebs” – Often Not Recommended

Here’s one common way people implement high availability. Notice how there are always multiple routes available for servicing a request. If one BIG-IP goes down, there is another to help. And this could be doubled with multiple data centers, failed over via DNS.

The visible redundancy and complexity in one diagram may be appealing. One can run through scenarios in order to make sure that yes, we can actually survive any failure and the ecommerce will not stop.

not recommended spiderweb tiers

So then what could make this my Not Recommended option?

2. Martin’s Reminder how to Think About Nodes

Fowler reminded us in Patterns of Enterprise Application Architecture how to look at distribution and tiers. For some reason people keep wanting to have certain machines running certain services, stitching up all the services they need with a few service calls. If you’re concerned about performance, though, you’re asking for punishment. Remote calls cost several orders of magnitude more than in-process calls, or even calls between processes on the same machine. And this architectural preference is rarely necessary.

One might lead to the first design with the logic of: “We can run each component on a separate box. If one component gets too busy we add extra boxes for it so we can load-balance our app.” Is that a good idea?

fowler distributed objects not recommended

The above is not recommended:

A procedure call between two separate processes is orders of magnitude slower [than in-process]. Make that a process running on another machine and you can add another order of magnitude or two, depending on the network topography involved. [PoEAA Ch 7]

This leads into his First Law of Distributed Object Design: Don’t distribute your objects!

The solution?

Put all the classes into a single process and then run multiple copies of that process on the various nodes. That way each process uses local calls to get the job done and thus does things faster. You can also use fine-grained interfaces for all the classes within the process and thus get better maintainability with a simpler programming model. [PoEAA Ch 7]

fowler clustered application recommended

3. “Strive for Thicker Stack on One Hardware Node” – Recommended

Observe the recommended approach below. There is still an external load balancer, but after a request is routed to an Apache/Nginx/etc front end, you’re all on one* machine.

If one tier fails on a node, pull the whole node out from rotation. Replace it. And re-enter it in the mix.

Your company’s teams have worked together to be able to deploy modular services. So when your ecommerce site needs a merchant gateway processing service, you can include it (as a library or binary) and run it locally on your node, making a call through to it as needed.

Services are also simpler to deploy, upgrade and monitor as there are fewer processes and fewer differently-configured machines.

recommended thicker nodes tiers

(* I understand there may be the occasional exception for remote calls that need to be made to other machines: databases, possibly memcached, and obviously third-party hosted services. But the point is that most everything else need not be remote.)

4. But, Practically Speaking How Far Do We Go?

A caveat first: these benefits become more pronounced as you have more and more nodes (and thus more and more complex spiderwebs of unnecessary failover).

Should there be a database server running on each node? Probably not at first; there is a maintenance cost associated with that. But after sharding your database and running with replication, why not? That way, if a node fails, you simply pull it out and replace it with a functioning one.

5. Checklist of Takeaway Lessons

  1. Keep it local. Local calls are orders of magnitude faster than remote calls.
  2. Make services modular so they don’t need to be remote, yet still have all the organizational benefits of separate teams.
  3. Simplicity in node-level-redundancy is preferred over tier-level-redundancy.

Often, people think of high availability in terms like round robin, load balancing, and failover. What do you think of? Leave a comment below with how you meet the trade-offs of designing for HA, as well as the architectural decisions of low latency.

Written by Jonathan

August 19th, 2009 at 12:55 am

Posted in architecture, code, java
