Dwarfguard Performance

Dwarfguard is designed to operate as a high-performance single-node application. For 0.x versions, the device namespace of a single deployment is limited to roughly 65000 devices; more devices than that cannot be registered. Performance testing, covering both stability and benchmarking, started with the first public release, 0.6.0 Early 1. For every minor and major release, we will publish a performance testing protocol.

The first table captures how many devices each released version was tested with, together with a link to the corresponding protocol and notes.

| Release       | Tested for (devices) | Protocol |
|---------------|----------------------|----------|
| 0.6.0 Early 1 | 10000                | Link     |
| 0.7.0 Early 2 | 30000                | Link     |
| 0.8.0 Early 3 | 40000                | Link     |

The application itself consists of several elements, and the performance-critical element is dwarfgd, the service daemon. For HTTPS-enabled deployments, another performance-determining element is the speed of encryption. Encryption can be handled either by the Apache Web Server directly in the deployment environment (which then communicates with dwarfgd via a localhost socket) or on a different box / VM.

Running a TLS-terminating reverse proxy is especially effective if you have multiple Dwarfguard deployments, as the significant CPU load caused by encryption can be offloaded to either a better-scaled VM or a dedicated box. This lets you share resources that would otherwise sit idle. The approach is particularly effective for handling peak loads, because the peaks of different deployments usually occur at different times. A system administrator does, however, need to set up the proxy and access permissions carefully so as not to compromise security.

Back to dwarfgd: the daemon consists of the main controlling thread, request-handling (RH) threads that process data coming from devices, and additional threads that take care of everything else. The following table captures the possible thread settings for Dwarfguard releases, together with the socket backlog size, which can be another performance-influencing piece of the puzzle.

| Release       | Type    | RH threads (min) | RH threads (max) | Other threads (min) | Other threads (max) | Socket backlog | Notes |
|---------------|---------|------------------|------------------|---------------------|---------------------|----------------|-------|
| 0.8.0 Early 3 | Static  | 2 | 8  | 6 | 8 | 64  | Increasing RH threads requires a service restart; 2 are enough for 40000 devices |
| 0.7.0 Early 2 | Static  | 2 | 2  | 6 | 8 | 64  | 2 RH threads confirmed enough for 30000 devices |
| 0.6.0 'early' | Static  | 2 | 2  | 6 | 8 | 64  | 2 RH threads determined enough for up to 10000 devices |
| 0.5.x 'beta4' | Dynamic | 1 | 10 | 5 | 7 | 256 | Dynamic to measure perf. for the first release |
| 0.4.x 'beta3' | Dynamic | 1 | 10 | 4 | 6 | -   | |
| 0.3.x 'beta2' | Static  | 4 | 4  | 0 | 0 | -   | |


While CPU power limits how many requests each request-handling thread can serve, it is not a hard limit on the total number of devices in the system. Requests that cannot be processed in time simply time out, and the affected devices return after a while to try again. The only hard limit is the amount of available memory: if the resident memory of dwarfgd outgrew the available RAM, Dwarfguard would cease working (as would any other application). Memory must therefore be considered when sizing a deployment for a given number of devices.

The following table captures the dwarfgd startup RAM footprint and the average and maximum memory consumed per attached device.

| Release       | Startup RAM (dwarfgd) | Startup memory (VM) | Avg RAM per device | Max RAM per device | Notes |
|---------------|-----------------------|---------------------|--------------------|--------------------|-------|
| 0.8.0 Early 3 | < 50 MiB | 215 MiB   | 13 KiB  | -       | RAM usage is 60-70% of 0.7.0; see the comparison sheet in the perf. protocol |
| 0.7.0 Early 2 | < 40 MiB | -         | 20 KiB  | -       | See the perf. protocol for details on RSS / memory |
| 0.6.0 Early 1 | < 40 MiB | 150 MiB   | 12 KiB  | 60 KiB  | |
| 0.5.0 beta4   | < 40 MiB | 220 MiB   | 40 KiB  | 150 KiB | |
| 0.4.0 beta3   | 100 MiB  | < 400 MiB | 150 KiB | 300 KiB | |
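For a quick sanity check, the per-device figures above can be turned into a naive worst-case estimate of the dwarfgd footprint alone. The sketch below is an illustration only, using the 0.6.0 Early 1 row (the most recent release with both average and maximum published); as explained right below, such a calculation is not sufficient for precise sizing.

```python
# Naive upper bound on the dwarfgd memory footprint, for illustration only.
# Figures are taken from the 0.6.0 Early 1 row above; real requirements also
# scale with request rate, and DB, Apache and OS overhead are ignored here.

STARTUP_RAM_MIB = 40          # dwarfgd startup RSS ("< 40 MiB")
MAX_RAM_PER_DEVICE_KIB = 60   # worst case per attached device

def dwarfgd_ram_upper_bound_mib(devices: int) -> float:
    """Rough worst-case dwarfgd RSS in MiB for a given device count."""
    return STARTUP_RAM_MIB + devices * MAX_RAM_PER_DEVICE_KIB / 1024

for n in (1000, 10000, 65000):
    print(f"{n:>6} devices -> ~{dwarfgd_ram_upper_bound_mib(n):.0f} MiB worst case")
```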


Unfortunately, the above numbers are not enough to compute the memory requirements of your deployment precisely, as the requirements scale with multiple factors. What we can offer as a guide are:

  1. recommended maximum numbers of devices per container size, listed in the "recommended maximum" table row below
  2. sample data from our stability and benchmark testing, showing e.g. the complete system memory utilization after a test emulating a particular number of devices was concluded

Based on that, you can find a suitable container (RAM) size by selecting the testing container with the closest higher number of devices than you intend to handle with your Dwarfguard deployment. The table below describes results for Dwarfguard 0.8.0; a small lookup sketch follows it.

| Container | CPUs / threads | CPU arch.   | RAM (MiB) | Devices | RAM used | Interval | Load per CPU (load / # of CPUs) |
|-----------|----------------|-------------|-----------|---------|----------|----------|---------------------------------|
| C1 | 1 / 1  | Xeon E-2250 | 512   | 1000  | 48 % | 200 s | 7 %  |
| C2 | 2 / 2  | Xeon E-2250 | 1024  | 3000  | 33 % | 200 s | 8 %  |
| C3 | 4 / 4  | Xeon E-2250 | 2048  | 10000 | 11 % | 200 s | 13 % |
| C4 | 8 / 8  | Xeon E-2250 | 4096  | 30000 | 5 %  | 200 s | 15 % |
| H1 | 8 / 16 | Core i7     | 32768 | 30000 | 2 %  | 200 s | 10 % |
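The selection rule described above then becomes a simple lookup. A minimal sketch, assuming you stay within the C1-C4 VM containers transcribed from the table (larger counts should be checked against the 0.8.0 protocol):

```python
# Tested 0.8.0 VM containers from the table above: (name, RAM in MiB, devices emulated).
TESTED_CONTAINERS = [
    ("C1", 512, 1000),
    ("C2", 1024, 3000),
    ("C3", 2048, 10000),
    ("C4", 4096, 30000),
]

def pick_container(target_devices: int):
    """Return the smallest tested container whose test covered at least target_devices."""
    for name, ram_mib, devices in TESTED_CONTAINERS:
        if devices >= target_devices:
            return name, ram_mib
    raise ValueError("No tested container covers this count; see the 0.8.0 protocol.")

print(pick_container(2500))   # -> ('C2', 1024)
print(pick_container(12000))  # -> ('C4', 4096)
```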

Just a remark: if you are using a VM container with limited memory, pay close attention to systemd - the systemd journal service can eat gigabytes of memory if not configured properly (e.g. via the SystemMaxUse / RuntimeMaxUse limits in /etc/systemd/journald.conf). Especially in a VM container, this may cause the Linux kernel to kill random processes - the database, Apache, dwarfgd, or even systemd itself - resulting in possible service interruptions and data loss. (Please note this applies to default Debian GNU/Linux installations, at least for versions 10 and 11.)

The following table gives the recommended rough maximum number of devices per memory size assigned to a VM. The calculation assumes that the OS installation is dedicated to operating Dwarfguard and that the VM provides no other services. You can find more performance test results in the performance testing document for the particular Dwarfguard release. Please note that the table also factors in other aspects, such as the stability test parameters used against that particular memory limit. While Dwarfguard was able to handle far more devices during our tests than the conservative recommended maximum, please stick to the 'better safe than sorry' strategy and scale according to the first table row (recommended maximum).

Also, while the RAM criterion seems to be independent of the other criteria (traffic, CPU), it is not. The same machine configuration may be fine for a deployment with a higher number of devices using the default data-push interval, yet fail to handle a lower number of devices with a considerably shortened interval, where failing can mean service interruptions and VM restarts. That is because the memory requirements scale not only with the number of devices, but with the number of requests per second as well.

The following recommendations apply to the default data-push interval of 260 seconds; the testing actually uses a 200-second interval to stay on the safe side of that.

| Release 0.8.0 Early 3         | C1 / 512 MiB | C2 / 1024 MiB | C3 / 2048 MiB | C4 / 4096 MiB | Notes |
|-------------------------------|--------------|---------------|---------------|---------------|-------|
| Recommended maximum           | 1000  | 3000  | 10 k  | 40 k  | Release 0.8 is recommended for up to 40 k devices |
| Max RAM-wise (stability test) | 2081  | 9143  | 90 k  | 566 k | Computed maximum, not recommended |
| Max CPU-wise (benchmark)      | 20085 | 37992 | 45443 | 51829 | Buffer included (can take more in the default setting) |


The last consideration is network traffic. To fully understand the amount of data and the number of requests dwarfgd needs to handle, the following table shows additional calculations for selected numbers of devices, based on real measurements. Please note that in reality the requests from devices are not evenly spread out in time. On the contrary, peaks do occur, so the actual load during a particular fraction of a minute or second can be much higher or much lower than illustrated.

| Devices | Contacts/minute | 1 contact every | Relative CPU units (devs / total) | Traffic, regular/daily (In) | Traffic/sec (In) |
|---------|-----------------|-----------------|-----------------------------------|-----------------------------|------------------|
| 1     | 0.23  | 260 seconds  | 1 / 201       | 1.3 MiB  | 15 B/s    |
| 10    | 2.3   | 26 seconds   | 10 / 211      | 13 MiB   | 0.2 KiB/s |
| 100   | 23    | 2.6 seconds  | 100 / 310     | 133 MiB  | 1.5 KiB/s |
| 300   | 69    | ~1 second    | 300 / 530     | 400 MiB  | 5 KiB/s   |
| 1000  | 230   | 0.26 seconds | 1000 / 1300   | 1.3 GiB  | 15 KiB/s  |
| 3000  | 692   | 86 ms        | 3000 / 3500   | 4 GiB    | 46 KiB/s  |
| 10000 | 2307  | 26 ms        | 10000 / 11200 | 13.3 GiB | 153 KiB/s |
| 60000 | 13846 | 4 ms         | 60000 / 66200 | 80 GiB   | 1 MiB/s   |
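If your device count is not in the table, the per-device figures extrapolate roughly linearly. A small sketch, assuming the default 260-second data-push interval and the measured ~1.3 MiB of daily inbound traffic per device:

```python
# Linear per-device traffic model derived from the table above (inbound only).
PUSH_INTERVAL_S = 260            # default data-push interval
DAILY_IN_PER_DEVICE_MIB = 1.3    # measured inbound traffic per device and day

def traffic_estimate(devices: int):
    contacts_per_minute = devices * 60 / PUSH_INTERVAL_S
    seconds_per_contact = PUSH_INTERVAL_S / devices
    daily_in_mib = devices * DAILY_IN_PER_DEVICE_MIB
    in_bytes_per_sec = daily_in_mib * 1024 * 1024 / 86400
    return contacts_per_minute, seconds_per_contact, daily_in_mib, in_bytes_per_sec

cpm, spc, daily_mib, bps = traffic_estimate(3000)
print(f"{cpm:.0f} contacts/min, one contact every {spc * 1000:.0f} ms, "
      f"{daily_mib / 1024:.1f} GiB/day in, {bps / 1024:.0f} KiB/s in")
```

For 3000 devices this reproduces the corresponding table row (692 contacts/min, one contact roughly every 86 ms, about 4 GiB/day and 46 KiB/s inbound).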


Now, while the traffic and RAM figures are self-explanatory, what about the CPU units?

A CPU comparison is hard to make because there are vast differences between physical CPUs, virtual machines, and containers. The numbers in the table are good only for roughly understanding the ratio between different deployments - e.g. if your deployment runs at about 50% CPU with x devices, you can guess how many devices it could handle at 100% CPU utilization.
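A trivial illustration of that guess, with the caveat that real scaling is not perfectly linear:

```python
def devices_at_full_cpu(current_devices: int, current_cpu_load_pct: float) -> int:
    """Linear guess of how many devices would saturate the CPU; real scaling is worse."""
    return int(current_devices * 100 / current_cpu_load_pct)

print(devices_at_full_cpu(10000, 50))  # ~20000 devices at 100% CPU, at best
```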

For every release, Dwarf Technologies runs performance testing. You can inspect the results and see for yourself, but our hardware is not identical to yours and we use artificial testing methods to emulate large numbers of devices, so you should take the measured results with reservation and stick to the recommendations.

Should you need more information when considering how to scale your deployment, or if you are curious about the results of the stability or benchmark tests performed, download the performance test protocols linked from the first table.

If you are considering where and how to deploy, please look into the Deploy options, especially if you are inexperienced with deploying Linux server applications - Dwarf Technologies offers to manage your Dwarfguard deployment in a safe and controlled environment, or to provide advice and assistance with setting up your own deployment.