Netconf scalability question

asked 2015-09-28 07:36:18 -0700

updated 2015-09-28 08:13:31 -0700

Hello, I am running scalability tests for Netconf and can't scale above 450 nodes. The tests are doing the following operations:

  1. Build a Docker image from a Dockerfile. The image runs ConfD.
  2. Start ODL.
  3. Spawn 400 containers running ConfD. ODL initiates a connection to each ConfD instance and then interprets the schemas exposed by the container.
  4. I wait until all 400 nodes are connected (with a 200s timeout).
  5. Rinse and repeat.
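
For readers reproducing the test, the wait in step 4 can be sketched as a polling loop with a deadline. This is only an illustration, not the original test harness: `wait_for_nodes` and the `count_connected` callable are hypothetical names, and how the number of connected nodes is actually obtained from ODL is left to the caller.

```python
import time
from typing import Callable


def wait_for_nodes(count_connected: Callable[[], int],
                   expected: int,
                   timeout_s: float = 200.0,
                   poll_s: float = 5.0) -> bool:
    """Poll until `expected` nodes report connected, or the timeout expires.

    `count_connected` is a hypothetical callable supplied by the caller; it
    should return the current number of connected NETCONF nodes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if count_connected() >= expected:
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(poll_s, remaining))


if __name__ == "__main__":
    # Fake data source standing in for a real query against the controller.
    samples = iter([120, 310, 400])
    print(wait_for_nodes(lambda: next(samples), expected=400, poll_s=0.1))
```

Using a monotonic clock keeps the 200s timeout immune to wall-clock adjustments during a long test run.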

I have observed that above 450 nodes or so, ODL shuts down RestConf and the Netconf nodes stay in the "Connecting" state. The machine running these tests has 200GB RAM and has very high I/O. I am also seeing this in the logs: https://gist.github.com/sniggel/fa337d204a4d245a63cf

I have started looking into increasing the fixed and flexible thread pools, but looking at the logs, I can see "too many files open" errors even though the machine's file limit is set to 25 million. Also, ConfD has a couple of YANG files that are synced with ODL.

Any input would be appreciated.


3 answers


answered 2015-09-28 14:36:50 -0700

updated 2015-09-28 14:37:08 -0700

As it turns out, the per-process open-file ulimit was set to 1024, while the global file limit was set to 25M. Raising the per-process limit to a reasonable number fixed the issue.
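
To illustrate the difference between the two limits, here is a minimal sketch that reads the per-process open-file limit (`RLIMIT_NOFILE`) with Python's standard `resource` module and raises the soft limit up to the hard limit. The helper names are made up for this example, and it only affects the Python process that runs it and its children. It does not change the limit of an already running ODL JVM, which has to be raised in the shell or service configuration that launches it.

```python
import resource


def nofile_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile(target: int) -> int:
    """Raise the soft open-file limit toward `target`, capped by the hard limit.

    Returns the soft limit in effect afterwards. Hypothetical helper for
    illustration only.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot exceed its hard limit.
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    print("before:", nofile_limits())
    raise_soft_nofile(4096)
    print("after: ", nofile_limits())
```

A soft limit of 1024 is exhausted quickly when every NETCONF session holds at least one socket, no matter how large the system-wide file limit is.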

Cheers


answered 2015-09-28 09:30:12 -0700

jamoluhrsen

One way to get more information is to attach a profiler (I've used jvisualvm to debug openflowplugin issues before). If you are hitting the fd limit with a 25M limit configured, then something is probably not getting cleaned up.
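
A lightweight complement to a profiler is to watch how many file descriptors the controller process holds over time. The sketch below assumes a Linux host with `/proc` mounted; `open_fd_count` is a hypothetical helper, and the PID of the ODL JVM has to be looked up separately.

```python
import os


def open_fd_count(pid="self") -> int:
    """Count the open file descriptors of a process via /proc (Linux only).

    For pid="self" the count includes the descriptor that os.listdir
    briefly opens to read the directory, so it is one higher than the
    number held by the rest of the program.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    before = open_fd_count()
    handle = open(os.devnull)
    print("descriptors added:", open_fd_count() - before)
    handle.close()
```

If the count keeps climbing while the number of connected nodes stays flat, descriptors are probably being leaked. If it plateaus near the per-process limit, the limit itself is the bottleneck.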


answered 2015-09-28 23:36:14 -0700

Tony Tkacik

Raising the ulimit will definitely help. In Lithium (not sure about Helium), NETCONF also shares the schema between devices of the same kind.

Memory overhead per device is rather low: in our testing we were consistently able to get ~10k sessions from NETCONF devices to one controller instance with a 2G heap.
