We’ve already discussed the green benefits of building minimal software in the previous article. As well as reducing waste by limiting features to those that deliver value, developing minimal software also means using minimal architectures. It’s not unusual to see applications designed to support theoretical future needs rather than implemented efficiently for the known, existing scenarios. Challenging the necessity of non-functional requirements (NFRs) will help you build a minimal architecture, as will a culture of developing minimally.
Below are some examples of minimal architecture decisions you may take as an architect:
- Don’t distribute until you need to – a modular monolith is often a great place to start and will reduce complexity until required
- Select lightweight container management tools such as Docker Swarm over heavyweight alternatives where they suit your use case
- Replace a long-running service with a simple Function as a Service
- For pub-sub messaging, do you really need a Kafka cluster, or could something simpler such as Redis handle your expected medium-term scale?
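To illustrate the long-running service vs. Function as a Service decision above, here is a minimal sketch of an event-driven, FaaS-style handler. The function name and event shape are illustrative, not tied to any particular cloud provider's SDK:

```python
# Sketch: replacing an always-on polling service with an event-driven,
# FaaS-style handler. The handler name and event shape are hypothetical.

import json

def handle_order_event(event: dict) -> dict:
    """Process a single order event; compute runs only when invoked,
    so nothing is consumed (or billed) while there are no events."""
    order = json.loads(event["body"])
    total = sum(item["price"] * item["qty"] for item in order["items"])
    return {"statusCode": 200, "body": json.dumps({"order_total": total})}

# The long-running alternative would poll a queue in a loop, burning
# energy even when there is nothing to do:
#
#   while True:
#       messages = queue.receive()   # mostly empty responses
#       time.sleep(5)
```

The point is not the arithmetic but the shape: the FaaS version has no idle loop, so hardware can be shared with other workloads between invocations.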
Much like minimal requirements, minimal architectures provide many benefits beyond reduced software carbon intensity: lower cost, complexity, and maintenance effort, for example.
API design

An overly sparse API may force client code to make multiple requests to a server. This ‘chattiness’ is highly inefficient, as each call incurs processing, memory, and networking overhead, all of which add to your software’s carbon intensity.
Conversely, a particularly verbose API may also be inefficient as some or perhaps even most of the data will have been retrieved, communicated, and processed only to be discarded as superfluous by the client code.
Getting the balance right requires good knowledge of actual API usage, which may span multiple clients. Future-proofing an API may be prudent and avoid future development costs, but it may also prove unnecessary and carry a negative carbon impact on an ongoing basis.
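A rough back-of-the-envelope model makes the chattiness cost concrete. The per-request overhead figure below is an invented placeholder, not a measured value; the structure of the calculation is what matters:

```python
# Toy model of the chattiness trade-off, assuming a hypothetical fixed
# per-request overhead. The overhead figure is illustrative only.

PER_REQUEST_OVERHEAD_MS = 50  # connection setup, headers, serialisation

def cost_of_fetch(request_count: int, payload_ms_each: float) -> float:
    """Rough proxy for total latency/energy of a fetch strategy."""
    return request_count * (PER_REQUEST_OVERHEAD_MS + payload_ms_each)

# Chatty: one call per order line (the classic N+1 pattern)
chatty = cost_of_fetch(request_count=25, payload_ms_each=2)

# Batched: a single call returning all 25 lines at once
batched = cost_of_fetch(request_count=1, payload_ms_each=25 * 2)
```

With these made-up numbers the chatty strategy pays the fixed overhead 25 times over, which is exactly the inefficiency the sparse-API discussion above warns about.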
Several choices in developing an API approach will impact efficiency. API architecture (e.g., REST, GraphQL, gRPC), protocol (TCP, UDP, custom wire protocols), and the data format used for requests and responses each involve a balance of efficiency against other factors, such as ease of debugging and development effort.
GraphQL can give you the ability to only request the data you need rather than a uniform response (lessening the burden of API verbosity). REST is ubiquitous, quick to develop, more general purpose, and potentially less efficient. gRPC is lightweight, efficient, harder to debug, and more effortful to develop.
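The GraphQL advantage can be sketched with a toy field-selection resolver. This is illustrative only: the record, the query, and the `select_fields` helper are invented stand-ins for what a real GraphQL server does:

```python
# Toy illustration of GraphQL-style field selection: the client asks
# for exactly the fields it needs, while a typical REST endpoint would
# return the whole resource. All names here are hypothetical.

FULL_USER = {  # what a REST /users/42 endpoint might return in full
    "id": 42,
    "name": "Ada",
    "email": "ada@example.com",
    "avatar_url": "https://example.com/a.png",
    "bio": "A long biography field the client never displays...",
}

GRAPHQL_QUERY = """
query {
  user(id: 42) {
    id
    name
  }
}
"""

def select_fields(record: dict, fields: list) -> dict:
    """Toy resolver: return only the fields the query asked for."""
    return {key: record[key] for key in fields}

response = select_fields(FULL_USER, ["id", "name"])
```

Every field left out of `response` is data that never has to be retrieved, serialised, transferred, or parsed, which is the verbosity saving described above.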
Typical data formats used with enterprise APIs are human-readable, such as JSON or XML. Often, humans do not read the data, certainly not in production use. This introduces inefficiency (verbosity) in favor of simplicity but provides a clear advantage during development and debugging. Binary serialisation formats, such as the Protocol Buffers used by gRPC, can increase processing and data transfer efficiency.
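The size gap is easy to demonstrate. The sketch below uses the standard-library `struct` module as a stand-in for Protocol Buffers; real protobuf messages also carry small field tags, but the spirit of the comparison holds:

```python
# Comparing a JSON payload with a binary packing of the same values,
# using stdlib struct as a simplified stand-in for Protocol Buffers.

import json
import struct

reading = {"sensor_id": 1042, "temperature": 21.5, "humidity": 48.0}

json_bytes = json.dumps(reading).encode("utf-8")

# Pack the same values as: unsigned int, float, float (little-endian)
binary_bytes = struct.pack(
    "<Iff", reading["sensor_id"], reading["temperature"], reading["humidity"]
)

# The binary form is 12 bytes; the JSON form is several times larger
# because it repeats field names and encodes numbers as text.
```

The trade-off is exactly the one described above: the binary payload is smaller and cheaper to parse, but you can no longer inspect it with the naked eye while debugging.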
Examples of Green Architecture Decisions
At the time of writing, there is a small but growing list of green software delivery considerations, patterns, and practices that can be used as a resource when designing software. It was recently introduced by the Green Software Foundation and can be found here. The list builds on the small set of patterns and practices introduced with the green software principles and is an open-source resource to which anyone can contribute.
Further to this, the following list provides a small set of examples of common architectural decisions that can make a significant difference to the green credentials of the software you build.
Choice of programming language
Some languages are naturally more efficient than others. Of course, there is more nuance than that statement suggests. Still, research shows that compiled languages tend to be the fastest and most energy-efficient, followed by virtual-machine languages and, finally, interpreted languages.
According to that research, Python performs in the bottom five of the 27 languages measured. This finding is notable given the language’s popularity (at the time of writing, it holds the top position in the TIOBE language index) and its prominent use in large-scale data platforms and analytics. As a result, Python applications can generate very high energy use in large-scale processing, storage, and network transfer. It is also easy to introduce inefficiency into these solutions, given their distributed and complex nature.
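When you are committed to Python, one practical mitigation is to push hot loops into C-implemented builtins rather than pure-Python loops. The sketch below is indicative only; absolute timings vary by machine, but the direction of the gap is consistent:

```python
# Sketch: blunting Python's energy cost by preferring C-implemented
# builtins (sum, str.join, etc.) over equivalent pure-Python loops.

import timeit

data = list(range(100_000))

def manual_sum(xs):
    """Pure-Python accumulation loop: every iteration pays interpreter
    overhead for the bytecode dispatch and the integer add."""
    total = 0
    for x in xs:
        total += x
    return total

# Both produce the same result...
assert manual_sum(data) == sum(data)

# ...but the builtin runs the loop in C, typically several times
# faster, which translates directly into less CPU time and energy.
loop_time = timeit.timeit(lambda: manual_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
```

The same reasoning is why offloading heavy numerical work to C-backed libraries (NumPy and similar) is standard practice in Python data platforms.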
Public cloud, choice of provider, and region
Public cloud infrastructure is significantly more energy efficient than on-premises or enterprise data centers, and cloud-native services utilise hardware more efficiently. The carbon intensity of the energy supply also varies by region, depending on the mix of sources powering the local grid.
Cloud providers each have differing sustainability commitments and progress. Therefore, your cloud provider choice and their operating region(s) can make a difference to your software’s green credentials. The creators behind Cloud Carbon Footprint have developed a methodology to calculate cloud emissions and Climatiq has used this approach to effectively visualise how different data centers compare in terms of carbon intensity.
It's worth noting that cloud vendor claims of 100% renewable energy can be a little misleading. As the Climatiq article shows, cloud data centers have varying carbon intensity levels, sometimes very high (e.g., in Indonesia). While cloud vendors invest heavily in renewable energy, it is often not possible to feed that energy into the grids powering their data centers – it is instead a form of carbon offset. Local grids still emit carbon when generating energy, and this carbon intensity should be used for decision-making, rather than assuming that “100% renewable energy” means zero carbon by-products.
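Region selection can be reduced to a simple comparison once you have intensity figures. The region names and numbers below are made-up placeholders; real values would come from a source such as Electricity Maps or the Cloud Carbon Footprint methodology:

```python
# Illustrative only: picking a deployment region by grid carbon
# intensity. Region names and gCO2/kWh figures are invented.

REGION_INTENSITY_G_PER_KWH = {
    "region-north-1": 120,    # hydro-heavy grid
    "region-central-1": 450,
    "region-asia-1": 700,     # coal-heavy grid
}

def greenest_region(intensity: dict) -> str:
    """Return the region whose grid has the lowest carbon intensity."""
    return min(intensity, key=intensity.get)
```

In practice this comparison sits alongside latency, data residency, and cost constraints rather than replacing them.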
Platform as a Service (PaaS) and Serverless (e.g., Function as a Service (FaaS))
PaaS and Serverless public cloud services are highly efficient uses of hardware. Cloud vendors have designed them to make the best possible use of available hardware resources, leading to far less idle time and significantly reduced energy use in many cases.
But it is worth noting that using FaaS can lead to the development of overly small units with high amounts of network chatter and unnecessary complication, especially at enterprise scale. This is worth considering when developing large-scale software highly reliant on FaaS.
Containers

Containers can provide an effective way to maximize the utilisation of available hardware if orchestrated effectively. Using cloud-managed container orchestration (or serverless containers) can further increase this benefit.
Take care to maximize the efficiency of the containers and their orchestration; this requires expert input during both development and operation.
Scheduling and batch vs. real-time
With easy access to cloud-managed services that enable event-driven responses, there is a natural lean towards these patterns even where the non-functional requirements do not demand that immediacy.
For those things that do not need this immediacy, and may happen thousands or millions of times over, consider scheduling batch activities to run at another time or even in another region.
By being aware of energy’s carbon intensity at different times of the day and in different locations, you can benefit from both batch execution efficiencies and lower emissions. Note that you will need to use forecast marginal carbon intensity rather than the combined carbon intensity of the grid. Marginal power sources will often have a greater intensity than the combined average.
In advanced cases, you could develop schedulers that automate scheduling based on batch latency requirements and the marginal carbon intensity across cloud regions and times. Such a scheduler runs each task in the most carbon-efficient way within given bounds on latency. GCP uses this approach for its batch scheduling, for example.
Another example is a paper on building carbon awareness into the Kubernetes scheduler. APIs such as WattTime or Electricity Maps can be queried for carbon intensity; they provide real-time and historical values and use machine learning to forecast marginal intensity.
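The core of such a scheduler is a small optimisation over forecast data. In the sketch below, the shape of the forecast and all the numbers are invented; in practice they would come from a carbon-intensity API such as those named above:

```python
# Minimal sketch of a carbon-aware batch scheduler. The forecast data
# (region names, hourly marginal gCO2/kWh figures) is hypothetical.

# forecast[region][hours_from_now] -> forecast marginal gCO2/kWh
FORECAST = {
    "region-a": [300, 280, 150, 140, 400],
    "region-b": [200, 210, 220, 500, 520],
}

def best_slot(forecast: dict, deadline_hours: int):
    """Pick the (region, hour-offset) pair with the lowest forecast
    marginal intensity that still starts before the latency deadline."""
    candidates = [
        (intensities[h], region, h)
        for region, intensities in forecast.items()
        for h in range(min(deadline_hours, len(intensities)))
    ]
    intensity, region, hour = min(candidates)
    return region, hour, intensity

region, hour, intensity = best_slot(FORECAST, deadline_hours=4)
```

Given a four-hour deadline, the scheduler shifts the batch both in time and in space, trading a few hours of latency for a lower-carbon slot.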
There are a host of other things you can do, some of them small but easy to overlook, such as cleaning up stored objects when no longer in use and setting appropriate retention periods for data and backups.
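A retention sweep like the one just mentioned is a few lines of logic. The object names and the 90-day period below are illustrative; in practice you would usually delegate this to the storage service's own lifecycle rules (e.g., S3 lifecycle configuration) rather than running your own sweep:

```python
# Sketch of a retention sweep: finding stored objects that have
# outlived their retention period. Names and the period are invented.

from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)

def expired_objects(objects: dict, now=None) -> list:
    """Return keys of objects older than the retention period.
    `objects` maps object key -> creation timestamp."""
    now = now or datetime.now(timezone.utc)
    return [key for key, created in objects.items()
            if now - created > RETENTION]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
stored = {
    "backup-2024-01-01.tar": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "backup-2024-05-20.tar": datetime(2024, 5, 20, tzinfo=timezone.utc),
}
```

Every object the sweep removes is storage hardware that no longer has to be powered, replicated, and backed up for data nobody needs.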
The AWS Well-Architected Sustainability Pillar documentation has some specific examples worth noting, and we’d recommend reading their documentation.
The next article in the series will explore some of the things that you can do to build greener software as a developer, covering topics including code efficiency, quality processes, environments, and CI/CD.