Can Netty efficiently handle scores of outgoing connections as a client?

I'm creating a client-server relationship whereby a single client will be connected to an arbitrary number of servers using persistent TCP connections. The actual number of servers is as yet undetermined, but the design goal is to shoot for 1000.

I found an example using direct Java NIO that nearly completely matches my mental model of how this could work:

http://drdobbs.com/jvm/184406242

In general, it opens up all of the channels and adds them to a single thread monitoring java.nio.channels.Selector. The use of the Selector, in particular, is what allows this to scale far better than using the standard thread-per-channel.
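As a rough illustration of that single-Selector pattern (not the code from the linked article), here is a minimal sketch using plain java.nio. The class name, the helper names `connectAll` and `demo`, the throwaway loopback listener, and the 5-second timeout are all assumptions made so the example is self-contained:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;

final class SelectorClientSketch {

    /** Opens `count` non-blocking connections to `target` on one Selector; returns how many completed. */
    static int connectAll(InetSocketAddress target, int count, long timeoutMillis) throws IOException {
        int connected = 0;
        try (Selector selector = Selector.open()) {
            for (int i = 0; i < count; i++) {
                SocketChannel ch = SocketChannel.open();
                ch.configureBlocking(false);
                if (ch.connect(target)) {
                    // Connection completed immediately (can happen on loopback).
                    ch.register(selector, SelectionKey.OP_READ);
                    connected++;
                } else {
                    // Otherwise ask the Selector to tell us when the connect finishes.
                    ch.register(selector, SelectionKey.OP_CONNECT);
                }
            }

            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (connected < count && System.currentTimeMillis() < deadline) {
                selector.select(100); // one thread waits on all channels at once
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isValid() && key.isConnectable()) {
                        SocketChannel ch = (SocketChannel) key.channel();
                        if (ch.finishConnect()) {
                            // From here on the channel is handled like any other: wait for reads.
                            key.interestOps(SelectionKey.OP_READ);
                            connected++;
                        }
                    }
                }
            }

            // Clean up; a real client would keep these channels open and keep selecting.
            for (SelectionKey key : new ArrayList<>(selector.keys())) {
                key.channel().close();
            }
        }
        return connected;
    }

    /** Starts a throwaway loopback listener (an assumption for the demo) and connects `count` clients to it. */
    static int demo(int count) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress target = (InetSocketAddress) server.getLocalAddress();
            return connectAll(target, count, 5000);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + demo(8));
    }
}
```

The point of the sketch is that the number of threads stays at one no matter how many channels are registered; only the number of keys on the Selector grows.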

I would rather use a (slightly) higher level socket framework like Netty, than direct Java NIO. Unfortunately, I have not been able to determine how Netty would handle a case like this. That is, the examples and discussions I've found all tend to center around the server side, with accepting scores of concurrent connections.

But what about doing this from the client side? If I create a large number of channels and just wait on their events, how is Netty going to handle this at the back-end?

2012-04-03 21:53
by granroth

If Netty will let you open an outgoing connection, register it for OP_CONNECT, and handle the OP_CONNECT event to register for OP_READ, everything after that is the same whether server or client - user207421 2012-04-04 01:36

Netty will use up to x worker threads that handle the work for you. Each worker thread has one Selector, and Channels are registered with it. The number of workers is configurable and defaults to 2 * the number of CPU cores.

2012-04-04 06:10
by Norman Maurer

Ah. That sounds perfect. Are the channels distributed evenly between the Selectors, then? Say we have a 16-core system and 1024 open connections. That implies 32 worker threads (by default), each serving 32 Channels - granroth 2012-04-04 16:35

It's kind of round-robin, so it's not exactly distributed across the - Norman Maurer 2012-04-04 16:57
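The arithmetic in the comments above can be checked with a small sketch. It models an idealized, strict round-robin assignment; the class and method names are made up for illustration, and this is not Netty's actual assignment code, which, as Norman notes, is only approximately even:

```java
final class RoundRobinSketch {

    /** Returns how many channels each worker receives under strict round-robin assignment. */
    static int[] distribute(int channels, int workers) {
        int[] perWorker = new int[workers];
        for (int i = 0; i < channels; i++) {
            perWorker[i % workers]++; // channel i goes to the next worker in turn
        }
        return perWorker;
    }

    public static void main(String[] args) {
        int cores = 16;          // figure from the comment above
        int workers = 2 * cores; // default worker count described in the answer
        int[] counts = distribute(1024, workers);
        // prints "32 workers, 32 channels on worker 0"
        System.out.println(workers + " workers, " + counts[0] + " channels on worker 0");
    }
}
```

When the channel count is not a multiple of the worker count, the first few workers simply end up with one extra channel each.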

This isn't a direct answer to your question but I hope it is helpful nonetheless. Below, I describe a way for you to determine the answer that you are looking for. This is something that I recently did myself for an upcoming project.

Compared to OIO (Old IO) the asynchronous nature of the Netty framework and NIO will indeed provide much better memory and CPU usage characteristics for your application. The way buffers are handled in Netty will also be of benefit as it will help you to avoid copying byte buffers. The point is that all of the thread pool and NIO details will be handled for you allowing you to focus on your business logic. You mentioned the NIO Selector and you will benefit from that; the nice thing about Netty is that you get the benefits without having to worry about that implementation yourself because it is already done for you.

My understanding of the client side is that it is very similar to the server side and should provide you with commensurate performance gains (as long as your business logic doesn't introduce any performance issues).

My advice would be to throw together a prototype that more or less does what you want. Leave out any time consuming details and just add in the basic Netty handlers that you need to make something that works.

Then I would use jmeter to invoke your client to apply load to the server and client. Using something like jconsole or jvisualvm will show you the performance characteristics of the client and server under load. You could also try jprobe. You can add a listener in jmeter that will indicate the throughput. I would advise running jmeter in server mode, with the client on another machine and the server on yet another. This is a bit of up-front work, but if you decide to move forward you will have these tools ready to go for further testing as you proceed.

I suspect a decent Netty implementation that doesn't introduce any extraneous, poorly performing components will give you the performance characteristics you are looking for, but the only way to know for sure is to measure the system under the expected load.

You need to define what the expected load looks like and the desired performance characteristics under such load. Given these inputs you can measure your system to find out if it will meet your expectations. I personally don't think anyone can tell you if it will behave in the desired manner. You have to measure it. It's the only reliable way to know if the system can meet your needs.

I would rather use a (slightly) higher level socket framework like Netty, than direct Java NIO.

This is the correct approach. You can try implementing your own NIO server and client but why do that when you have the benefit of a highly refined framework at your fingertips already?

2012-04-04 04:24
by Matt Friedman

As you can see in the example from Netty's docs (http://netty.io/docs/stable/guide/html/#start.9), you can control exactly the number of worker threads (meaning the number of underlying selectors) on the client side. Netty solves a number of issues that are very hard to handle in a simple way, such as combining NIO with SSL, and it ships with a lot of default encoders/decoders for Zip, etc. I started using Netty a few weeks ago and it was quite quick to get into. (I recommend downloading the project with all the example code inside; there is a lot of documentation in it that cannot be found at the URL above.)

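            // Boss and worker thread pools; each worker thread owns one Selector.
            ChannelFactory factory = new NioClientSocketChannelFactory(
                    Executors.newCachedThreadPool(),   // boss threads (connection setup)
                    Executors.newCachedThreadPool());  // worker threads (channel I/O)

            ClientBootstrap bootstrap = new ClientBootstrap(factory);

            // Every new channel gets its own pipeline containing the handler.
            bootstrap.setPipelineFactory(new ChannelPipelineFactory() {
                public ChannelPipeline getPipeline() {
                    return Channels.pipeline(new TimeClientHandler());
                }
            });

            bootstrap.setOption("tcpNoDelay", true);
            bootstrap.setOption("keepAlive", true);

            // connect() is asynchronous and returns a ChannelFuture immediately.
            bootstrap.connect(new InetSocketAddress(host, port));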

Good luck,

Renaud

2012-04-04 12:13
by RenaudBlue

Thank you! Your response and Norman's response both pointed out the core bit of knowledge that I was missing -- that worker threads each have an underlying Selector. That is key - granroth 2012-04-04 16:52

You're welcome! - RenaudBlue 2012-04-13 10:59