Why is this version 2.0?
Until now, all FireBrick firmware has run on processors with only one or two cores.
But future models, including the FB9000 (pre-orders open any day now), have a multi-core architecture. This requires particular development attention to ensure all the cores work together, sharing information and resources without blocking each other or stepping on each other's toes.
To support this, much of our code has needed to be fundamentally restructured.
The previous code was optimised for the abilities and constraints of the earlier, simpler hardware, and was carefully tuned over many years to extract maximum performance from it. Now the software has to be adapted to create a release that will run on both old and new FireBricks.
The FireBrick firmware is unusual because applications are built directly on top of our own operating system. Reworking a complex application to run efficiently on multiple cores would be a considerable undertaking on any operating system (the major OSs have, of course, been dealing with the difficulties of multi-core operation for many years). But for FireBrick, we've also needed to develop, write and test the fundamental building blocks that the OS provides for synchronisation across cores, and then rewrite and retest all the applications themselves.
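To give a flavour of the kind of building block involved, here is a minimal spinlock sketched with C11 atomics. This is purely illustrative, a sketch assuming a C11 toolchain; it is not our actual implementation and the names are hypothetical.

    #include <stdatomic.h>

    /* Minimal spinlock: one core at a time may hold the lock. */
    typedef struct {
        atomic_flag locked;
    } spinlock_t;

    #define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

    static void spin_lock(spinlock_t *l)
    {
        /* test-and-set happens as one indivisible step, so only one
         * core can see the flag as clear and enter the section */
        while (atomic_flag_test_and_set_explicit(&l->locked,
                                                 memory_order_acquire))
            ;   /* busy-wait until the current holder releases it */
    }

    static void spin_unlock(spinlock_t *l)
    {
        /* release ordering publishes the critical section's writes
         * to whichever core acquires the lock next */
        atomic_flag_clear_explicit(&l->locked, memory_order_release);
    }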
This had to be done in a way that didn't harm performance on existing platforms or add too much of a maintenance burden going forward, which meant finding approaches that produce efficient code on every model. This was by far the hardest part of the rewrite, especially since some of our older platforms use the ARMv5 architecture, which lacks many of the low-level atomic instructions that multi-core synchronisation normally relies on (and as Linux doesn't support SMP on this architecture, it is fairly uncharted territory).
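As a sketch of what that means in practice: ARMv5 predates the load/store-exclusive instructions that arrived with ARMv6, so the only atomic read-modify-write it offers is the SWP swap instruction. That is enough to build a test-and-set lock, but not a compare-and-swap, which rules out many textbook lock-free techniques. The helpers below (hypothetical names, GCC inline assembly) show roughly what SWP can express; they are illustrative, not our code.

    /* A test-and-set lock built on ARMv5's SWP instruction, which
     * atomically exchanges a register with a word in memory. */
    static inline unsigned armv5_swap(volatile unsigned *addr,
                                      unsigned newval)
    {
        unsigned old;
        __asm__ __volatile__("swp %0, %1, [%2]"
                             : "=&r"(old)
                             : "r"(newval), "r"(addr)
                             : "memory");
        return old;   /* the value in *addr before the swap */
    }

    static inline void armv5_lock(volatile unsigned *lock)
    {
        /* keep swapping 1 in until the value swapped out was 0,
         * i.e. until we were the core that actually took the lock */
        while (armv5_swap(lock, 1) != 0)
            ;
    }

    static inline void armv5_unlock(volatile unsigned *lock)
    {
        *lock = 0;   /* plain store; ARMv5 also lacks memory-barrier
                      * instructions, so any ordering guarantee here is
                      * an assumption about the platform's memory model */
    }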
Our first attempts involved the liberal use of fairly conventional read/write mutexes around shared resources, but on some platforms this was unbearably slow. Our implementation has been through several iterations, each one improving performance, until we arrived at the fast and reliable code that is now in the 2.0 release. A large percentage of the system has been rewritten, and hardly any component of our codebase was able to stay unchanged.
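To illustrate where that slowness comes from, here is a naive reader-counting read/write lock in the style of such a first attempt, sketched with C11 atomics (hypothetical names, not our actual code):

    #include <stdatomic.h>

    /* Naive read/write lock: top bit means a writer holds the lock,
     * the lower bits count the active readers. */
    #define WRITER 0x80000000u

    typedef struct { atomic_uint state; } rwlock_t;

    static void read_lock(rwlock_t *l)
    {
        for (;;) {
            unsigned s = atomic_load_explicit(&l->state,
                                              memory_order_relaxed);
            if (s & WRITER)
                continue;   /* a writer holds the lock: spin */
            /* even the read path performs an atomic write to the
             * shared lock word, pulling its cache line to this core */
            if (atomic_compare_exchange_weak_explicit(&l->state, &s,
                    s + 1, memory_order_acquire, memory_order_relaxed))
                return;
        }
    }

    static void read_unlock(rwlock_t *l)
    {
        atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
    }

    static void write_lock(rwlock_t *l)
    {
        unsigned expect = 0;   /* free: no readers, no writer */
        while (!atomic_compare_exchange_weak_explicit(&l->state,
                   &expect, WRITER,
                   memory_order_acquire, memory_order_relaxed))
            expect = 0;
    }

    static void write_unlock(rwlock_t *l)
    {
        atomic_store_explicit(&l->state, 0, memory_order_release);
    }

Every read_lock/read_unlock pair is two atomic read-modify-writes on the same word, however many readers there are, so on a read-heavy hot path the lock word's cache line bounces between cores constantly. Costs like this are the kind of thing the later iterations had to eliminate.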
How has all this been tested?
This release has gone through the most rigorous (and longest) testing cycle of any FireBrick release to date.
A flexible automated test environment has been built to seek out problems and prove them eliminated. This runs a consistent set of tests (including throughput checks) on every new build with a variety of configurations across all platforms. Alongside the automated tests, we also use this environment to run bespoke manual tests to reproduce and resolve specific problems.
Additionally, we have been running prototype hardware in our production environment. The first customers were steered to a 9000 LNS running this software in January 2023. This allowed us to discover several real-life production issues, which have been fixed and extensively tested.
This release (on various FireBrick platforms) is routing all our broadband traffic. A number of our other internal systems have also been upgraded to prototype hardware, so for instance we know that real world BGP scenarios are working well with the new build.
We've also been testing the new features in 2.0, including added functionality in the user interface and improved IPv6 support.
The result of all this is that, despite all the internal changes, 2.0 is our most stable and feature-rich software to date. And in general usage, it seems as fast as, or faster than, the previous release across all platforms.