X-Road 7.4.0 Is Here

The year 2023 is ending, and it's time to publish X-Road 7.4.0! The previous version, 7.3.0, was released in June, and it provided a new Central Server UI and management REST API. Version 7.4.0 includes several improvements for the Central Server, but the Security Server hasn’t been forgotten either.

The beta version is already out, and the official release version will be published in December 2023. The release notes are available here. Please note that the beta version doesn’t include all the changes mentioned in the release notes.

Let’s see what the highlights of version 7.4.0 are!

Publishing global configuration over HTTPS

Until now, the Central Server and Configuration Proxy have only supported publishing global configuration over HTTP. Since the global configuration is signed, its integrity and authenticity are guaranteed despite lacking HTTPS support. Nevertheless, securing the global configuration download with HTTPS helps to maintain the privacy of the global configuration data.

A new private key and a self-signed TLS certificate are created automatically when installing a new Central Server or upgrading an existing installation from an older version. After the installation or upgrade, the Central Server Administrator must manually apply for a TLS certificate from a trusted Certificate Authority (CA) and then configure the certificate. This step is required because the Security Server does not trust the new automatically generated self-signed certificate by default. The Security Server supports disabling certificate verification, but disabling it in production environments is not recommended.

Starting from version 7.4.0, the global configuration will be available on the Central Server and Configuration Proxy ports 80 and 443. The old HTTP endpoint on port 80 guarantees that older Security Servers not supporting downloading global configuration over HTTPS continue to work normally. Also, the new Security Servers that support downloading global configuration over HTTPS will fall back to HTTP if the TLS certificate of the new HTTPS endpoint is not configured correctly.
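The fallback behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not the Security Server's actual implementation: the function name, the `fetch` callback, and the scheme order are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical fetcher: takes a URL and returns the response body, or raises.
Fetcher = Callable[[str], bytes]


def download_global_conf(host: str, path: str, fetch: Fetcher,
                         schemes: Sequence[str] = ("https", "http")) -> Tuple[str, bytes]:
    """Try each scheme in order and return (scheme_used, body)."""
    last_error: Optional[Exception] = None
    for scheme in schemes:
        url = f"{scheme}://{host}{path}"
        try:
            return scheme, fetch(url)
        except Exception as exc:  # e.g. TLS verification failure or timeout
            last_error = exc
    raise RuntimeError(f"global configuration download failed: {last_error}")
```

Because the global configuration is signed, the body still has to pass signature verification regardless of which scheme delivered it. That verification step is left out of the sketch.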

Changes in the global configuration download ports may cause changes to firewall configuration on the Central Server, Configuration Proxy and Security Server. Inbound traffic from the Security Servers to port 443 must be allowed on the Central Server and Configuration Proxy. Similarly, outbound traffic to the Central Server and Configuration Proxy port 443 must be allowed on the Security Servers.

When upgrading the Central Server from version < 7.4.0 to version >= 7.4.0, the configuration anchor must be re-generated, distributed to all the Security Server administrators through an external channel (e.g., email) and imported to each Security Server to enable downloading global configuration over HTTPS. Without the new configuration anchor, Security Servers will continue to download global configuration over HTTP.

Enforcing new key creation when generating CSR on the Security Server

A good security practice is creating a new key pair when applying for a new authentication or sign certificate on the Security Server. Starting from version 7.4.0, the Security Server supports enforcing new key creation when generating a certificate signing request (CSR) on the Security Server. The feature is enabled by default, which means that it's not possible to generate a new CSR for a key that already has a certificate. The feature can be disabled by setting the “proxy-ui-api.allow-csr-for-key-with-certificate” system property to true on the Security Server.
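As a rough sketch, the rule boils down to a simple decision. The function and parameter names below are made up for illustration, and the default of `False` is inferred from the statement that enforcement is enabled by default.

```python
def may_generate_csr(key_has_certificate: bool,
                     allow_csr_for_key_with_certificate: bool = False) -> bool:
    """Return True if a CSR may be generated for the given key.

    The second argument mirrors the value of the
    "proxy-ui-api.allow-csr-for-key-with-certificate" property.
    """
    if not key_has_certificate:
        # A key without a certificate can always be used for a CSR.
        return True
    # A key that already has a certificate is allowed only when
    # enforcement has been switched off.
    return allow_csr_for_key_with_certificate
```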

Configuring a minimum required client Security Server version

From version 7.4.0, service providers can configure a minimum required X-Road software version for client Security Servers. It means that client Security Servers older than the configured version cannot access services anymore.

Service providers can configure the required minimum version using the “proxy.server-min-supported-client-version” system property. The property has no value by default, meaning a minimum version hasn't been set. When the value is set, all the minor and patch versions starting from the configured version are approved.
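To make the comparison concrete, here is a minimal sketch of how such a check could work. It assumes plain `major.minor.patch` version strings, and the helper names are hypothetical rather than taken from the X-Road code base.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a 'major.minor.patch' string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def client_version_accepted(client_version: str,
                            min_supported: Optional[str]) -> bool:
    """Accept every client when no minimum version has been configured."""
    if not min_supported:
        return True
    return parse_version(client_version) >= parse_version(min_supported)
```

With the minimum set to 7.2.0, a client running 7.2.1 or 7.3.0 is accepted while a client running 7.1.9 is rejected.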

X-Road Operators may set this value in the country-specific meta packages. For example, Estonia and Finland set the value so that the three latest versions are supported, which means that requests originating from client Security Servers older than version 7.2.0 are no longer accepted in the Estonian and Finnish X-Road ecosystems. The value is rolling, meaning it changes with every X-Road minor version: when version 7.5.0 is released, the minimum supported version becomes 7.3.0.
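The rolling rule is simple arithmetic on the minor version. The sketch below uses a hypothetical helper name and assumes that the window never crosses a major version boundary.

```python
def rolling_min_version(latest: str, supported_minor_count: int = 3) -> str:
    """Return the oldest supported version when the N latest minors are supported."""
    major, minor, _patch = (int(part) for part in latest.split("."))
    # Supporting three versions means the latest one and the two before it.
    oldest_minor = max(minor - (supported_minor_count - 1), 0)
    return f"{major}.{oldest_minor}.0"
```

For the latest version 7.4.0 this gives 7.2.0, and for 7.5.0 it gives 7.3.0, matching the examples in the text.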

Changes in default client information system communications ports

The default client information system inbound communication ports are changed from 80 to 8080 and from 443 to 8443 on Ubuntu-based Security Servers. In this way, the default ports are consistent regardless of the hosting operating system.

The change is only applied to new installations, and existing ones are unaffected by it. Also, if custom ports have been configured locally by the Security Server administrator, they are unaffected.

X-Road Operators may set the default port values in the country-specific meta packages. The Estonian ecosystem uses ports 80 and 443 by default for new installations after the change. In other words, the Estonian X-Road user organisations are unaffected by the change.

LDAP group mapping support

The Central Server and Security Server administrator user permissions are managed using X-Road-specific Linux groups mapped to X-Road user roles. An administrator must be a member of the X-Road-specific groups to access the management UI.

Starting from version 7.4.0, it’s possible to map additional Linux groups to X-Road user roles using the “proxy-ui-api.complementary-user-role-mappings” and “admin-service.complementary-user-role-mappings” system properties. This makes it possible to grant administrators permissions to the management UI through existing Linux groups instead of adding them to the X-Road-specific groups, so there's no need to change users' groups.
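Conceptually, the complementary mappings extend the built-in group-to-role table. The sketch below only illustrates that idea. The group and role names, the data structures, and the function are assumptions for the example and don't reflect the actual property syntax, which is documented in the user guides.

```python
from typing import Dict, Iterable, Set

# Built-in mapping from X-Road-specific groups to roles (illustrative names).
DEFAULT_MAPPINGS: Dict[str, Set[str]] = {
    "xroad-security-officer": {"XROAD_SECURITY_OFFICER"},
    "xroad-system-administrator": {"XROAD_SYSTEM_ADMINISTRATOR"},
}


def resolve_roles(user_groups: Iterable[str],
                  complementary: Dict[str, Set[str]]) -> Set[str]:
    """Collect roles from both the built-in and the complementary mappings."""
    roles: Set[str] = set()
    for group in user_groups:
        roles |= DEFAULT_MAPPINGS.get(group, set())
        roles |= complementary.get(group, set())
    return roles
```

For example, mapping a hypothetical existing group `ldap-ops` to `XROAD_SYSTEM_ADMINISTRATOR` gives its members that role without adding them to `xroad-system-administrator`.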

Rotating global configuration sign keys

The Central Server global configuration sign keys are included in the configuration anchor imported to the Security Server. Using the information in the anchor, the Security Server can verify the global configuration signature. Until now, if the sign keys were rotated, a new configuration anchor had to be generated on the Central Server, distributed to all the Security Server administrators through an external channel (e.g., email) and imported to each Security Server. Only after that were the changes applied by each Security Server.

Starting from version 7.4.0, the Central Server global configuration sign keys are included in the global configuration. Thanks to this, rotating the sign keys is possible without importing a new version of the configuration anchor to each Security Server. Instead, the Security Server gets the updated configuration information from the global configuration directly and applies it immediately.

Importing the configuration anchor to the Security Server manually is still possible. It's required when the Security Server is initialised for the first time and when a Security Server was offline or turned off while the global configuration sign key was rotated.

Disabling a subsystem temporarily

The Security Server administrator can now temporarily disable a subsystem so that none of the services added under the subsystem can receive requests. If the same subsystem is registered on multiple Security Servers, disabling the subsystem on one Security Server doesn’t affect the others. When a subsystem is disabled, the association between the subsystem and the Security Server where the subsystem is disabled is not visible to other Security Servers in the ecosystem. This applies to the local and federated ecosystems.

Disabling a subsystem doesn't change any configuration items regarding the subsystem or its services. It only makes the association between the Security Server and subsystem invisible to other Security Servers so that they cannot send requests to it. Similarly, a disabled subsystem cannot be used as a service client either.

Disabling and enabling a subsystem has a delay of a couple of minutes when the default configuration is used. During the delay, the subsystem may still receive service requests. However, the requests are not forwarded to the service provider information systems by the Security Server. Instead, all received requests will fail and an error message is returned to the service consumer. This is because the subsystem state is maintained on the Central Server, and updating it requires the Security Server to communicate with the Central Server. The length of the delay depends on the global configuration generation interval on the Central Server and the global configuration download interval on the Security Server. Other Security Servers become aware of a change in a subsystem’s state only after they download an updated version of the global configuration. Therefore, they may try to contact services under a disabled subsystem during the delay period.
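As a back-of-the-envelope estimate, the upper bound of the delay is roughly the sum of the two intervals. The function below is a simplification that ignores processing and network time, and the 60-second values in the usage note are assumed for illustration rather than taken from the text above.

```python
def worst_case_propagation_delay(generation_interval_s: int,
                                 download_interval_s: int) -> int:
    """Approximate upper bound, in seconds, for a state change to reach a Security Server.

    Worst case: the change lands just after a global configuration was
    generated, and the new configuration is generated just after a
    Security Server finished its previous download.
    """
    return generation_interval_s + download_interval_s
```

With both intervals assumed to be 60 seconds, the estimate is 120 seconds, which is in line with the "couple of minutes" mentioned above.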

Changing the Security Server address

The Security Server address is added to the Central Server when the authentication certificate is registered. Before version 7.4.0, updating the address afterwards required the Security Server administrator to contact the Central Server administrator over some external channel, e.g., email. Starting from version 7.4.0, the address can be updated using the Security Server administrator UI or management REST API.

The Security Server administrator can update the Security Server address by using the Security Server administrator UI and management REST API. The change is approved automatically by the Central Server and doesn't require manual approval from the Central Server administrator. By default, the change becomes visible to all the Security Servers in the ecosystem within a couple of minutes. However, the length of the delay depends on the global configuration generation interval on the Central Server and the global configuration download interval on the Security Server.

Migrate from Akka to gRPC

In September 2022, Lightbend announced that Akka, one of the third-party open-source libraries that X-Road heavily depends on, would change its licensing model. The new licensing model is commercial, but open-source projects meeting specific criteria may apply for an exception. However, the conditions of the license granted to open-source projects differ from the Apache 2.0 open-source license that Akka used before the change. From X-Road's perspective, the new license includes some unfavourable conditions, so Akka has been replaced with gRPC in X-Road.

Because of the change, Akka-related system properties are no longer supported starting from version 7.4.0, e.g., “<component>.akka.remote.artery.advanced.maximum-frame-size”. If Akka-related system properties have been used to override Akka’s default configuration values, they should be manually removed from the configuration.
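A small script can help locate leftover overrides before the upgrade. The sketch below assumes that local overrides are kept in an INI-style file where the section name is the component and the key is the rest of the property name. The function name is hypothetical, and the sketch only reports matches instead of editing the file.

```python
import configparser
from typing import List, Tuple


def find_akka_properties(ini_text: str) -> List[Tuple[str, str]]:
    """Return (section, key) pairs for every key that starts with 'akka.'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    found: List[Tuple[str, str]] = []
    for section in parser.sections():
        for key in parser[section]:
            if key.startswith("akka."):
                found.append((section, key))
    return found
```

Running it against the contents of the local configuration file lists the entries that should be reviewed and removed by hand.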

Technical changes and upgrades

Besides functional changes and new features, several more technical changes have also taken place under the hood. The UI framework has been upgraded from Vue 2 to Vue 3, and the Spring Boot backend framework has been upgraded from version 2 to 3.

Also, the supported Java version has been upgraded from Java 11 to Java 17. The new Java version is installed automatically on all supported platforms except Red Hat Enterprise Linux 7 (RHEL7). On RHEL7, the Security Server administrator must manually configure a new package repository. More information about the upgrade process will be available in the release notes.

Other improvements and updates 

Besides the already-mentioned features, version 7.4.0 includes many other minor improvements and updates. For example, it adds a new command line tool to verify message log archive hash chains, removes deprecated rate limiting parameters on the Security Server, improves the Central Server UI, supports publishing REST services using OpenAPI 3.1, replaces the SHA-1 hashing algorithm with SHA-256, and much more. To fully understand all the changes, please review the release notes document.

What’s next?

Initially, support for automated certificate management through the Automatic Certificate Management Environment (ACME) protocol was scheduled for version 7.4.0. Unfortunately, support for ACME didn't make it to version 7.4.0, but it will be included in version 7.5.0 instead. Other changes that will be introduced in version 7.5.0 include, but are not limited to, support for Ubuntu 24.04 LTS and support for Red Hat Enterprise Linux (RHEL) 9. Version 7.5.0 will be released during Q2 in 2024.

While developing X-Road 7, NIIS is already working on the next X-Road major version, X-Road 8. If you’re interested in hearing more about the topic, please check out the talk that I delivered at the X-Road Community Event 2023 in September. More news regarding X-Road 8 will be published soon. Stay tuned!

Is X-Road a Data Space Technology?

Data spaces are a hot topic right now in the field of data exchange and interoperability. However, it's not a new concept since its roots date back to 2005. The original definition was somewhat technical, but over the years, the term has expanded to cover all four layers of interoperability defined by the European Interoperability Framework (EIF).

Today, the term data space has several different definitions depending on whom you ask. Multiple international initiatives are working on data spaces to create common governance models, specifications, reference implementations, etc. Let's look at the definitions created by some of those initiatives.


“An infrastructure that enables data transactions between different data ecosystem parties based on the governance framework of that data space. Data space should be generic enough to support the implementation of multiple use cases.”

Source: Data Spaces Support Centre - DSSC Glossary - Version 1.0

“Decentralised, governed and standard-based structure to enable trustworthy data sharing between the data space participants on a voluntary basis. Data spaces may be purpose- or sector-specific, or cross-sectoral. Common European data spaces are a subset of data spaces within the scope of EU policies.”

Source: Gaia-X Glossary

 

“A Dataspace is a set of technical services that facilitate interoperable asset sharing between entities.”

Source: IDS Dataspace Protocol v0.8

 

“A data space is a federated data ecosystem within a certain application domain and based on shared policies and rules.”

Source: Sitra Rulebook for a Fair Data Economy

 

“A common European data space brings together relevant data infrastructures and governance frameworks in order to facilitate data pooling and sharing.”

Source: Commission Staff Working Document on Common European Data Spaces


When looking at the definitions, it’s clear that they are defining the same subject even though they’re approaching it from different angles. Also, it’s evident that data spaces are more than a technology since governance and policies play a significant role in the definitions.

The first version of X-Road was released in Estonia in 2001 – four years before the term data space appeared in an article for the first time. The technical implementation of X-Road has evolved and changed over the years, but the main concept has remained the same since the beginning. Also, X-Road is not only about technology; it also has an organisational model and trust framework. When comparing X-Road to the data space definitions, it’s easy to find many common factors and similarities. In fact, the definitions could very well be used to describe X-Road on a high level. But does that make X-Road a data space technology?

To answer the question, digging deeper into data spaces is necessary. Let's look at more detailed documentation and specifications to see how well X-Road is aligned with them.

The International Data Spaces Association

The International Data Spaces Association (IDSA) is a not-for-profit organisation aiming to create a global standard for international data spaces (IDS) and foster the global data spaces community. IDSA provides several data space-related assets.

Let’s study some of the assets provided by the IDSA and see how well X-Road is aligned with them. However, this will not be a complete comparison between the IDSA assets and X-Road. Instead, only selected parts of specific assets will be reviewed to highlight similarities and differences. Studying the source materials is recommended if you're interested in the topic in more detail.

Data space fundamentals

The IDS Rulebook defines foundational concepts of data space that seem to apply to X-Road on a high level:

  • Establishing trust

  • Data discoverability

  • Data contract negotiation

  • Data sharing & usage

  • Observability

  • Vocabularies and semantic models.

The IDS Rulebook and X-Road have the same vision of enabling data sovereignty and creating a trusted data-sharing ecosystem. They’re both based on the idea that trust is rooted in one or more trust anchors, trusted participants must meet specific policies that can vary between different ecosystems, and data sharing consists of peer-to-peer interactions between the participants.

Policies play an essential role in any data ecosystem. The IDS Rulebook defines multiple policy groups and the relationships between them. The same policy groups apply to X-Road, too, with the difference that X-Road doesn't define the structure and relationship of the policies or offer tools to express them. For example, the IDS specifications cover defining usage and contract policies and negotiating them as a part of the data exchange process. In X-Road, it’s up to the data exchange parties to agree on those policies using some external channel, e.g., email or service catalogue. Also, X-Road supports managing access to services. Still, the access policies are not technically associated with any other policies since the other policies are created and maintained outside of X-Road.

Operating models

Image 1. Data space operating models. Source.

The IDS Rulebook defines three different operating models for data spaces:

  • Centralised data space authority

  • Federated / distributed data space authority

  • Decentralised data space authority.

In the context of X-Road, the data space authority is equal to the X-Road Governing Authority. X-Road supports centralised and federated authority. Centralised authority is the most typical alternative for X-Road, and it's the model that a single X-Road ecosystem uses. When two X-Road ecosystems are federated, the responsibility can be considered distributed between the authorities of the federated ecosystems. However, even in a federated setup, the authorities only have control over their own ecosystem.

Conceptual model

Image 2. Data space conceptual model. Source.

The conceptual model defined by the IDS Dataspace Protocol is very similar to X-Road’s conceptual model. Doing a mapping between the models is straightforward:

  • Dataspace – X-Road Ecosystem

  • Dataspace Authority – X-Road Governing Authority

  • Participant – Member

  • Participant Agent – Security Server

  • Identity Provider – Certificate Authority

  • Credential Issuer – N/A

  • Dataspace Registry – Central Server, Service Catalogue

The descriptions of the organisational entities (Dataspace Authority, Participant, Identity Provider) are pretty well aligned. In contrast, there are differences in the descriptions of the technical entities (Participant Agent, Dataspace Registry). The technical differences are covered in the Components section.

Components

Image 3. Interaction between data space components. Source.

Comparing the components provides a good overview of the similarities and differences between the IDS-RAM and X-Road from a technical perspective. The IDS Identity Provider is not actively involved in the data exchange process, so it's omitted from the illustration.

IDS Identity Provider

The IDS Identity Provider consists of three complementary components:

  • Certificate Authorities (CAs) issue and manage technical identity claims.

  • The Dynamic Attribute Provisioning Service (DAPS) issues short-lived tokens providing information about Connectors.

  • The Participant Information Service (ParIS) provides business-related details on Participants.

The IDS Identity Provider combines X-Road's Certificate Authority and Central Server. In X-Road, the roles of the CA and DAPS are combined since Member and Security Server-related information is included in the certificates. At the same time, the IDS Identity Provider decouples the certificate and the associated information. In that way, it's possible to alter the attributes of a certificate, e.g., attach new attributes to a certificate, without reissuing the certificate.

In X-Road, the Central Server stores some business-related information about Members. In addition, the Service Catalogue is a complementary X-Road component that may provide additional information about Members. Therefore, the Central Server and Service Catalogue are equal to the ParIS.

IDS Metadata Broker

The IDS Metadata Broker is basically a registry of IDS Connectors, their capabilities, characteristics, and metadata of the data they offer. The IDS Metadata Broker provides endpoints for registering, publishing, maintaining, and querying the metadata. The X-Road Central Server has a similar role as a registry of Members and Security Servers, with the difference that the Central Server doesn't store any information about the data or services offered by Security Servers. Instead, the Service Catalogue is an optional X-Road component that provides information about the data and services.

IDS Connector

The IDS Connector is the point of access to a data space, providing standardised data exchange between Participants. The IDS Connector is connected to all the other data space components since it provides data and metadata to them. For example, the IDS Connector provides technical interface description, authentication mechanism, and associated data usage policies to the Metadata Broker and usage contracts to the Clearing House. Also, several other IDS components (App Store, Metadata Broker, and Clearing House) are based on the IDS Connector architecture.

In X-Road, the component that provides access to an X-Road ecosystem is the Security Server. Like the IDS Connector, the Security Server provides secure and standardised data exchange between Members. The Security Server communicates with all the other components of an X-Road instance – either directly or indirectly.

IDS Clearing House

The IDS Clearing House provides a logging service that records relevant information for clearing, billing and usage control. For example, the information is used to provide a settlement service based on usage contracts which can help to automate payments between the data exchange parties. In addition, the logged usage control data can be used to validate access to resources. Technically, the IDS Clearing House consists of an IDS Connector.

X-Road doesn’t have a component that would be a direct match with the IDS Clearing House. However, the optional X-Road Metrics component provides some similar capabilities, e.g., collecting service usage statistics that can be used to automate payments. In addition, the Security Server provides a digitally signed and timestamped log record of each data exchange transaction that can be used to validate the transaction afterwards. However, the data exchange parties store the log records locally, and no third parties have access to them.

IDS App Store

The IDS App Store is a platform to distribute IDS Apps for IDS Connectors. Currently, X-Road doesn’t have a similar component.

IDS Vocabulary Hub

The IDS Vocabulary Hub is a vocabulary management platform for IDS use cases. It’s a platform to host, maintain, publish, and document vocabularies. It gives access to the vocabulary terms and their descriptions. The IDS Information Model is the lowest common denominator and can be extended with additional vocabularies. The IDS Vocabulary Hub communicates with IDS Connectors and infrastructure components.

The X-Road architecture doesn’t include a vocabulary component, but it goes without saying that interoperable data exchange requires semantic interoperability. Therefore, it’s up to the X-Road Governing Authority and data exchange parties to agree on where and how vocabularies are managed and stored. For example, the Finnish Digital Agency has developed an Interoperability Platform that can be used to maintain and publish common terminologies.

Conclusions 

Comparing data spaces and X-Road from various aspects shows that they have a lot in common on a high level. However, looking into the details shows that they also have many differences.

For example, the IDS specifications cover various aspects of data exchange, e.g., usage conditions and policies, contract negotiation, and data transfer process. They do not set restrictions on the data transfer protocol or wire protocol, that is, the protocol used to transfer the data. In other words, any existing transfer protocol can be used, which enables using existing solutions and supporting many different transfer protocols. This is a big difference compared to X-Road, which has its own transfer protocol. Also, X-Road doesn’t technically cover the aforementioned aspects (usage conditions and policies, contract negotiation) covered by the IDS specifications.

Let's return to the original question: can X-Road be considered a data space technology? The answer is no if X-Road is strictly compared to the IDS specifications, despite the similarities. The IDSA has a certification scheme for the IDS components, and X-Road would not meet the certification requirements. The IDS Testbed is a platform that can be used to conduct evaluations that ensure that an IDS component is implemented securely and conforms to the relevant specifications. Components that pass the evaluation receive certification and are approved in the IDS ecosystems.

On the other hand, the European Commission considers data exchange ecosystems to be data spaces even if they are not based on technologies conforming to the IDS specifications, for example, the Once-Only Technical System (OOTS) and European Health Data Space (EHDS). Those ecosystems align with the common definition and characteristics for data spaces, but the technologies they use in the implementation do not conform to the IDS specifications. With the same logic, an X-Road ecosystem could also be considered a data space.

Multiple organisations and initiatives are developing specifications and standards for data spaces, e.g., International Data Spaces Association (IDSA), Gaia-X, and Fiware. The initiatives work together, aiming to keep the specifications and standards aligned and not create multiple overlapping or competing specifications. However, the work is currently in progress, and not everything is fully ready and aligned yet. Therefore, it might be too early to define what is a data space and what’s not based on the deliverables of a single actor. Nevertheless, in the long run, common standards and specifications are the only way to achieve interoperability within and between data spaces.

New X-Road® Central Server UI and management REST API are here!

In recent years, the Security Server has experienced a total external and internal makeover. The process got started in 2019 when support for REST services was added. In 2020, the Security Server got a new user interface (UI) and a management REST API that enabled the automation of common configuration and maintenance tasks. Releasing X-Road 7 in 2021 enhanced the UI's look and feel and brought several other significant changes and improvements under the hood. While all these major changes have been implemented for the Security Server, the Central Server has received only some smaller updates. However, the Central Server has been remembered and will be the star of X-Road 7.3.0.

The beta version of X-Road 7.3.0 is already out, and the official release version will be published at the end of June 2023.

Easier administration and streamlined onboarding process

The most significant change in X-Road version 7.3.0 is the fully renewed Central Server UI. The new UI improves the usability and user experience of the Central Server. The new intuitive UI makes regular administrative tasks easier and supports streamlining the onboarding process of new X-Road members.

For example, complementary management requests for authentication certificates and client registration requests are no longer required. It's enough to send a registration request from the Security Server and approve it with two clicks on the Central Server. And like before, enabling automatic approval of registration requests makes the approval process fully automated.

Image 1. List pending management requests.

Image 2. Management request details.

Image 3. Approve management request.

Management REST API enables task automation

Another significant change in X-Road version 7.3.0 is the brand-new Central Server management REST API. The API provides all the same functionalities as the UI and can be used to automate common maintenance and management tasks. Maintaining and operating the Central Server can be done more efficiently as configuration and maintenance tasks require less manual work. The new UI uses the same API under the hood.

The Central Server User Guide provides more information about the API, and the API's OpenAPI 3 description is available on GitHub. Access to the API is controlled using API keys that can be managed through the Central Server UI or through the API itself. In addition, access to the API can be restricted using IP filtering.
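As a rough illustration of API key based access, the sketch below builds (but does not send) an authenticated request. The host, port, and endpoint path are placeholders, and the `X-Road-ApiKey token=...` header format is an assumption modelled on the Security Server management API, so it should be confirmed against the Central Server User Guide and the OpenAPI description before use.

```python
import urllib.request


def build_management_request(base_url: str, path: str,
                             api_key: str) -> urllib.request.Request:
    """Build a GET request that carries an API key in the Authorization header."""
    request = urllib.request.Request(f"{base_url.rstrip('/')}{path}", method="GET")
    # Assumed header scheme; verify it against the official API documentation.
    request.add_header("Authorization", f"X-Road-ApiKey token={api_key}")
    request.add_header("Accept", "application/json")
    return request
```

The returned object can be passed to `urllib.request.urlopen` once the real base URL, path, and TLS settings for the environment are known.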

Changes in the architecture

The new UI and management REST API have also caused changes in the Central Server architecture and packaging. The previously existing Jetty (xroad-jetty) component has been replaced with the new UI and API (xroad-center), registration service (xroad-center-registration-service) and management service (xroad-center-management-service) components. These changes have affected Central Server’s log files, directories, software packages, and services. It’s strongly recommended that Central Server administrators study the details of these changes from the release notes before upgrading to version 7.3.0.

Image 4. Changes in the Central Server architecture - before version 7.3.0 (left) and starting from version 7.3.0 (right).

Wait, there’s more!

Even though the new Central Server UI and management REST API are the most significant and most visible changes in version 7.3.0, the new version contains many other new features, improvements, and fixes. Here’s a short overview of other changes included in the latest version.

  • Security improvements on the Central Server:

    • Encrypt backup files (opt-in)

    • Verify the integrity of backup files on restore.

  • Run all the X-Road components on Java 11. Remove support for Java 8.

  • Create a separate security hardening guide that provides information about hardening the Central Server and Security Server host configurations.

  • Implement configurable request rate and size limits for the Central Server REST API and management services.

  • Changes in allowed characters in X-Road system identifiers and improved validation of the identifiers.

  • Technology updates and a decrease in technical debt.  

The complete list of changes with more detailed descriptions is available in the release notes.

What’s next?

Implementing the new Central Server was a long process that required more time and effort than was initially expected. Unfortunately, this has postponed the implementation of some other new features. More changes to the Central Server are scheduled in the upcoming X-Road versions, but the focus will now shift to other roadmap items.

More information about the X-Road development roadmap is available here. More detailed information about the backlog items scheduled for version 7.4.0 is available here.

Third-party security experts have assessed the security of the new Central Server. However, should you have any findings, they can be reported through the newly launched X-Road Bug Bounty program.

Unravelling the Complexities of National Data Exchange Networks: A Network Science Approach

Introduction

This post is based on the findings from my research project titled "Graph Analysis of Dynamic National Data Exchange Networks."

In the age of relentless digital connectivity, understanding complex networks has become increasingly critical, spanning from social media platforms to the emerging world of blockchain technologies. X-Road, an established data exchange infrastructure, has been embraced by countries such as Estonia, Finland, Iceland, Colombia, Argentina, and Vietnam. Catering to millions of individuals, X-Road can be viewed as a complex network where government bodies, companies, non-profits, and various other organisations exchange data with one another.

In this post, we shall explore the intricacies of national data exchange networks through the lens of network science. By investigating the Estonian X-Road network (X-tee), I aimed to better understand the underlying patterns within data exchange networks and pinpoint potential areas for enhancement. Estonia has been collecting the network's transaction data (through the X-Road Metrics component, an open-source extension to X-Road) since 2016. The anonymised open data serves as a valuable starting point, and the insights derived could potentially be applicable to other nations as well.

Key Findings: A Network Science Approach

By analysing over 30 million data queries on the Estonian X-Road network, several key insights were obtained using network science analysis methods. The network shares common attributes with other real-world networks: 

  1. Sparsity: an overall low number of connections compared to the maximum possible connections among its members.

  2. Central giant component: a dominant connected subgraph in which a large fraction of the network's nodes/members are interconnected. 

  3. Power law distribution for parts of the network: revealing a small number of highly connected nodes and a large number of less connected ones. 

These identified characteristics suggest that the network is well-suited for further modelling and analysis using network science methodologies.
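To make the three attributes more concrete, here is a minimal sketch that computes them for a tiny, made-up query graph using only the Python standard library. The member names and edges are invented for illustration and are not taken from the X-tee open data; the research itself may have used different tooling and definitions.

```python
from collections import Counter, deque

# Toy directed query graph: (client, producer) pairs. Member names are made up.
edges = [
    ("tax", "population"), ("tax", "business"), ("bailiff", "population"),
    ("hospital", "health-fund"), ("health-fund", "population"),
    ("school", "education"),
]

nodes = sorted({n for edge in edges for n in edge})

# 1. Sparsity: density = observed edges / possible directed edges.
density = len(set(edges)) / (len(nodes) * (len(nodes) - 1))

# 2. Giant component: largest weakly connected component, found with BFS.
neighbours = {n: set() for n in nodes}
for a, b in edges:
    neighbours[a].add(b)
    neighbours[b].add(a)


def component(start):
    """Return all nodes reachable from start, ignoring edge direction."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen


components = []
unvisited = set(nodes)
while unvisited:
    comp = component(next(iter(unvisited)))
    components.append(comp)
    unvisited -= comp
giant_fraction = max(len(c) for c in components) / len(nodes)

# 3. Degree distribution: how many nodes have each (undirected) degree.
degree_counts = Counter(len(neighbours[n]) for n in nodes)

print(f"density={density:.3f} giant_fraction={giant_fraction:.2f}")
print(sorted(degree_counts.items()))
```

A graph this small cannot show a power law, of course; on real data, the degree counts would typically be plotted on log-log axes to check for a heavy tail.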

Some of the key findings from the analysis:

  • Public sector organisations, particularly governmental institutions, form the backbone of the data exchange infrastructure, being the most connected and active members of the network.

  • Nighttime is the prime time for mass data queries from government organisations on people and companies, particularly for tax authorities and bankruptcy bailiffs. During the daytime, service sectors like healthcare flourish, with the Health Insurance Fund and hospitals among the most active X-tee members.

  • The network's most active members could be grouped into five distinct communities: 

    • Healthcare

    • IT and Infrastructure

    • Social Security and Taxes

    • Internal Affairs and Transport

    • Education, Defence, and Environment.

Figure 1. The 50 most active members clustered into 5 communities. See the full-size image.

Though the community groupings may not be flawless, it's crucial to emphasise that these communities were identified solely by analysing query volumes between network members throughout the day. The content of the data queries, which is not publicly available, was not factored into the community detection process. This implies meaningful relationships between network members and groups of members could be discovered even without contextual information.

Implications and Future Directions

The findings from this analysis project have several implications. 

First, the research demonstrates the value of network science in modelling and analysing data exchange networks. This paves the way for more advanced prediction models and real-time monitoring tools. By discerning interaction patterns and activity distribution, decision-makers can enhance system performance, addressing both cybersecurity and economic concerns.

Second, the research highlights the importance of the public sector in driving data exchange, as well as the diverse range of services that rely on these networks. Understanding these interactions could help policymakers optimise resource allocation and improve the overall functioning of public services.

Lastly, the ability to accurately identify communities within the network suggests that further insights can be gained by examining the data transaction flows between these groups. This could potentially lead to a better understanding of the relationships between different sectors and the dynamics of the data economy.

Limitations and Challenges

While the findings of this research project provide valuable insights into the intricacies of national data exchange networks, it is essential to acknowledge some limitations that could impact the conclusions drawn from the analysis.

  1. Lack of contextual information: The reliance on transaction data, without the content of the data queries, limits the depth of understanding of the relationships between network members. Including contextual information could provide a more comprehensive view of how different sectors interact within the network.

  2. Generalizability: The analysis is based on the Estonian X-Road network, and the findings may not be directly applicable to other countries or networks with distinct characteristics or data exchange practices.

  3. Possible biases: The data or methodology used in the analysis may introduce biases that could affect the outcomes and conclusions. Further investigation may be required to identify and address these biases to ensure the reliability and validity of the findings.

  4. Dynamic nature of data exchange networks: As networks evolve over time, the findings from this research may be impacted by changes in the network structure or the interactions between members. Periodic re-analysis or real-time monitoring would be needed to maintain an accurate understanding of the network dynamics.

  5. Need for further research: The findings presented in this blog post warrant additional investigation to validate or expand on the conclusions. Future research could explore the impact of incorporating contextual information, compare data exchange networks across countries, or investigate the relationships between different sectors and the dynamics of the data economy more thoroughly.

By acknowledging and addressing these limitations, the research can be further refined, and the understanding of national data exchange networks can be deepened, ultimately contributing to more effective decision-making and policy development.

Conclusion

In conclusion, this research project demonstrated the power of network science in shedding light on the complex world of national data exchange networks. As an increasing number of countries adopt data exchange solutions like X-Road, understanding their intricacies will be crucial in improving decision-making, reducing bureaucracy, and enhancing the overall happiness of citizens. The methodologies and insights derived from this project could serve as a valuable foundation for future work in this domain and may also encourage more countries and municipalities to adopt secure data exchange layers, ultimately benefiting millions of people around the world.

Andrius Matšenas, a recent Mathematics graduate from the University of Southampton, has a strong interest in network science, which he delved into in his BSc thesis – the basis for this blog post. With a passion for designing software products, Andrius co-founded Stardust Network, where he led a team to develop apps that empower users to take control of their personal data. He also gained valuable product development experience as a Product Analyst at NFTPort. Find out more: matsenas.ee

Database, dataset, data service, or service? Getting to know X-Road services.

Services are essential building materials of any data exchange ecosystem, and X-Road is no exception. Therefore, the number of available services is one of the key metrics when measuring the effects and benefits of an X-Road ecosystem.

For clarity, in X-Road, services are technical interfaces between information systems that are not used by end-users directly. Instead, end-users communicate with X-Road services indirectly through other systems and platforms, e.g., using the state portal that fetches information from multiple base registries over X-Road.

Besides the number of services, other important metrics are the number of member organisations and connected information systems, and the amount of data and queries exchanged between the members. However, from the ecosystem perspective, the number of available services may be the most essential metric. According to a study by Kristjan Vassil in 2016, the discrete threshold for ecosystem growth appears at 50 data repositories.

X-Road comes with tools to measure the number of available services in an ecosystem. Since X-Road is based on a decentralized architecture, the X-Road member organizations maintain information about available services locally on their Security Servers. There's no centralized list of available services by default, but the X-Road Operator may collect the data from Security Servers and publish it in a service catalog. For this purpose, the Security Server provides a set of built-in metadata services that can query metadata about the services published by different member organisations. However, interpreting the numbers requires some background knowledge of what the term service means in the context of X-Road. This blog post aims to explain the anatomy of service in X-Road.

Database, dataset, data service, or service?

Over the years, many different terms have been used to describe information systems in service providers' roles in X-Road. The term database was the most used for many years, while the term service has become the most common in recent years. Nevertheless, different terms are still used interchangeably.

Technically, X-Road doesn't distinguish between a database, data set, data service, or service. Also, there's no difference between a data service, a business service, or an aggregated service. Instead, X-Road divides services into SOAP, OpenAPI, and REST. In other words, dividing services into different categories is based on their technical characteristics rather than the type of the service. 

Identifying services

In X-Road, all services are identified using a unique service identifier string. The identifier is used to invoke services, and the number of available services in an X-Road ecosystem is determined by counting the service identifiers. The service identifier includes information about the X-Road ecosystem, the organization owning the service, and the information system providing the service. The service identifier doesn't contain any information about the service category, but the built-in metadata services can be used to access the category information.
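The structure of a service identifier can be sketched as a small data class. The slash-separated layout and the field names below (instance, member class, member code, subsystem, service code, optional version) are assumptions made for this example, and the sample identifier is invented; the X-Road protocol documents define the exact representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceId:
    instance: str        # the X-Road ecosystem
    member_class: str    # part of the owning organisation's identifier
    member_code: str     # part of the owning organisation's identifier
    subsystem: str       # the information system providing the service
    service_code: str
    version: Optional[str] = None

    @classmethod
    def parse(cls, text: str) -> "ServiceId":
        """Parse an assumed 'a/b/c/d/e[/version]' textual form."""
        parts = text.split("/")
        if len(parts) not in (5, 6) or not all(parts):
            raise ValueError(f"unexpected service identifier: {text!r}")
        return cls(*parts)

    def member(self) -> str:
        """Identifier of the organisation owning the service."""
        return "/".join((self.instance, self.member_class, self.member_code))


# Hypothetical identifier, for illustration only.
sid = ServiceId.parse("EX/GOV/12345/population-register/getPerson/v1")
print(sid.member(), sid.subsystem, sid.service_code, sid.version)
```

Note that nothing in the parsed fields reveals whether the service is SOAP, OpenAPI, or REST, which matches the point above about the category not being part of the identifier.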

Connecting services

Publishing a service to X-Road requires that the organization owning the service has been registered as a member of an X-Road ecosystem and has access to a Security Server. First, the organization must complete an onboarding process to join an X-Road ecosystem. The organisation’s identity is verified by a trusted Certificate Authority (CA), and the X-Road Operator registers the organization. As a result, the organization is given an X-Road organization identifier.

The Security Server is the organisation’s technical access point to the X-Road ecosystem. The organization may deploy its own Security Server, use a shared Security Server, or buy a Security Server as a service from a commercial service provider.

When an X-Road member organization has access to a Security Server, an information system providing a service must be connected to the X-Road ecosystem. In X-Road's terms, it means registering a new subsystem identifier on the Security Server(s) or utilizing an existing subsystem. A subsystem represents an information system or a logical group of information systems. The information system may be in a service consumer role, a service provider role, or both.

The service is then added under the subsystem. If the information system provides multiple services, all the services can be added under the same subsystem. Alternatively, various services published by the same information system can also be added under different subsystems. Access permissions to the services are defined on a service level, so whether the same or different subsystem is used doesn't affect them.

A service can be published on one or more Security Servers simultaneously, and one Security Server can publish multiple services owned by different organisations. For high availability, publishing services on multiple Security Servers is recommended. The Security Server supports high availability in two different ways: internal and external load balancing. The number of Security Servers where a service is published doesn’t affect the service identifier – it’s always the same.

On the other hand, it's also possible to publish the same service under two different service identifiers. For example, a free version of a service is published on a standalone Security Server with no high availability, and a paid version of the same service is published with a different service identifier on a Security Server cluster with external load balancing. That way, it's possible to provide two versions of the same service with different SLAs.

SOAP, OpenAPI, and REST services

X-Road supports three service categories: SOAP, OpenAPI, and REST. Regardless of the category, all the services are identified using a service identifier. However, how the services work and are managed varies between the categories.

SOAP

X-Road Message Protocol for SOAP defines how service consumers and service producers communicate with the Security Server. The protocol is based on SOAP profile 1.1. It comes with some X-Road-specific limitations and additional requirements, e.g., only synchronous request-response operations are supported, some SOAP headers are mandatory, and the SOAP body must use the document/literal style.

A common approach is to have an additional adapter service component between the Security Server and a SOAP client or service. The adapter service implements the X-Road Message Protocol for SOAP and converts all incoming/outgoing messages to/from the X-Road SOAP profile.

SOAP services are connected to the Security Server by providing a URL of a WSDL service description that may contain one or more SOAP service endpoints. The Security Server validates the structure of the WSDL description, but it doesn’t validate the WSDL against the service endpoint implementation.

Typically, a single SOAP endpoint represents a service that implements one action or procedure. Each endpoint has a unique service identifier. Also, endpoints can have multiple versions that have their own identifiers. For example, a WSDL description with four SOAP endpoints counts for four X-Road services since each endpoint gets its own X-Road service identifier.

Access permissions to SOAP services are managed on the service endpoint level. When a single WSDL contains multiple SOAP endpoints, access rights must be defined for each endpoint separately.

OpenAPI

Consuming and producing OpenAPI and REST services via X-Road is possible without an additional adapter service component. X-Road-specific information required by the Security Server (e.g., service client identifier, service provider identifier, message id, etc.) is transferred and processed so that existing REST-style services and service consumers can be connected to X-Road with minimal changes or no changes at all. This is achieved by transferring X-Road-specific information required by the Security Server in HTTP headers and URL parameters outside the message payload. The full details are available in the X-Road Message Protocol for REST document.
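As a hedged sketch of what "outside the message payload" can look like from the client's side, the example below builds (but does not send) a request in which the service identifier travels in the URL path and the client identifier in an HTTP header. The `/r1/` prefix and the `X-Road-Client` header name are assumptions based on the X-Road Message Protocol for REST; the host name and all identifiers are invented.

```python
from urllib.parse import quote
from urllib.request import Request

SECURITY_SERVER = "https://ss.example.org"  # hypothetical consumer Security Server


def build_rest_request(client_id: str, service_id: str, resource: str) -> Request:
    """Build (but do not send) a REST request routed through the Security Server."""
    # The service identifier goes into the URL path, not the payload.
    url = f"{SECURITY_SERVER}/r1/{quote(service_id)}/{resource.lstrip('/')}"
    # The client identifier goes into an HTTP header, not the payload.
    return Request(url, method="GET", headers={"X-Road-Client": client_id})


req = build_rest_request(
    client_id="EX/COM/98765/webshop",                       # hypothetical consumer
    service_id="EX/GOV/12345/population-register/persons",  # hypothetical service
    resource="/v1/persons/42",
)
print(req.full_url)
```

Because the payload is untouched, an existing REST client usually only needs a different base URL and one extra header to go through X-Road.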

OpenAPI services are REST APIs that have an OpenAPI Specification available. The Security Server doesn't set any restrictions to the content type of the API messages, so the content type isn't limited to JSON only.

OpenAPI services are connected to the Security Server by providing a URL of an OpenAPI specification that describes a REST API with one or more API endpoints. The Security Server validates the structure of the OpenAPI specification, but it doesn’t validate the specification against the API implementation.

Typically, REST APIs are resource-centric, and endpoints are used to change the state of resources. However, REST APIs may also be RPC-style and implement actions or procedures. Either way, an OpenAPI service has a unique service identifier that covers all its API endpoints. For example, an OpenAPI service with four API endpoints counts for one X-Road service identifier.

A single OpenAPI service may support multiple API versions, or different API versions may be published as separate OpenAPI services. If the API version is included in the API base path URL (e.g., https://api.example.com/v1), a new OpenAPI service must be created for a new API version. Conversely, if the API version isn't included in the base path URL (e.g., https://api.example.com), the same OpenAPI service can access different API versions. Access to the API works so that all the paths under the base path URL are accessible to service clients with sufficient access permissions. Therefore, the base path URL must be defined with caution.

Access permissions to OpenAPI services can be managed on the API and API endpoint levels. Giving access on the API level means providing access to all the API endpoints by default. Also, if the API has endpoints not defined in the OpenAPI specification, they can be accessed too. In contrast, giving access on the API endpoint level only provides access to specific endpoints. API endpoint level access permissions are defined using a combination of HTTP request method and path. Therefore, it is possible to define access rights for a single endpoint or alternatively for a subset of endpoints using wildcards. By default, the Security Server has a list of all the endpoints defined in the OpenAPI specification, but adding new endpoints manually is supported. Security Server’s access rights management only supports allowing access – explicitly denying access is not supported, e.g., allowing access to all endpoints on the API level and denying access to a single endpoint is not supported.
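The allow-only model with method and path combinations can be sketched as follows. This is an illustrative matcher, not the Security Server's implementation: the wildcard semantics (`*` within one path segment, `**` across segments, `*` as "any method") and the example rules are assumptions made for this sketch.

```python
import re
from typing import Iterable, Tuple

Rule = Tuple[str, str]  # (HTTP method or "*", path pattern)


def _pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    # Assumed wildcard semantics: "**" spans any number of path segments,
    # "*" stays within a single segment.
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_allowed(method: str, path: str, rules: Iterable[Rule]) -> bool:
    """Allow-only check: a request passes if any rule matches; there are no deny rules."""
    for rule_method, rule_path in rules:
        method_ok = rule_method == "*" or rule_method.upper() == method.upper()
        if method_ok and _pattern_to_regex(rule_path).fullmatch(path):
            return True
    return False


# Hypothetical rules granted to one client.
rules = [("GET", "/pets/*"), ("*", "/public/**")]
print(is_allowed("GET", "/pets/42", rules))     # matches the first rule
print(is_allowed("DELETE", "/pets/42", rules))  # no rule matches
```

Since there is no deny rule, "everything except one endpoint" has to be expressed by listing the allowed endpoints explicitly rather than granting API-level access.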

Besides access rights management, the Security Server does not use the endpoint-related information for anything else, e.g., the Security Server does not validate if an endpoint defined in a request by a client information system exists under an API or not. In other words, if a client information system has sufficient access rights to invoke an API endpoint, the Security Server forwards the request to the specified endpoint without any further validations.

REST

REST services are REST APIs that don’t have an OpenAPI specification available. REST services are connected to the Security Server by providing the API's base path URL. A REST service has a unique service identifier that covers all its API endpoints. For example, a REST API with four API endpoints counts for one X-Road service identifier.

REST services can be used to connect any HTTP-based endpoints to the Security Server. The Security Server doesn't set any restrictions to the content type of the API messages, so the content type isn't limited to JSON only. For example, a group of SOAP services could be connected to the Security Server using a REST service. It would be enough to provide the base URL of the SOAP services without a WSDL service description. In that case, the group of SOAP services would count for one X-Road service identifier.

Access permissions to REST services can be managed on the API and API endpoint levels. Giving access on the API level means providing access to all the API endpoints. In contrast, giving access on the API endpoint level only allows access to specific endpoints. Since REST services don't have an OpenAPI specification that defines the API endpoints, the endpoints must be added manually by the Security Server administrator if they need to be used in access rights management.

Metadata services

The Security Server provides a set of built-in methods that service clients can use to discover what services are available to them and to download the machine-readable service descriptions. These methods are known as metadata services and are accessed using the service metadata protocol for SOAP and the service metadata protocol for REST. The metadata services have separate versions for SOAP and REST services.

Counting members, information systems, and services

The number of registered member organisations can easily be counted based on the member identifiers. Similarly, the number of connected information systems can be estimated from the subsystem identifiers. However, the number of subsystem identifiers doesn't directly tell the number of connected information systems since a single information system may have multiple subsystems, and various information systems can share the same subsystem. Therefore, the number of connected information systems is only indicative.

With services, things get more complicated. The number of service identifiers doesn’t directly match the number of connected services. SOAP service endpoints are counted separately, while OpenAPI services are calculated on the API level rather than the API endpoint level. The same applies to REST services as well. However, X-Road supports counting the number of individual API endpoints, too, if they have been defined in an OpenAPI specification or manually.
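The counting rules described above can be sketched in a few lines. The `ServiceDescription` type and the sample data are hypothetical and only mirror the rules in the text: every SOAP endpoint gets its own identifier, while an OpenAPI or REST service counts once regardless of how many endpoints it has.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ServiceDescription:
    kind: str                                   # "SOAP", "OPENAPI" or "REST"
    name: str
    endpoints: List[str] = field(default_factory=list)


def count_service_identifiers(descriptions: Iterable[ServiceDescription]) -> int:
    total = 0
    for d in descriptions:
        if d.kind == "SOAP":
            total += len(d.endpoints)  # each WSDL endpoint has its own identifier
        else:
            total += 1                 # one identifier covers the whole API
    return total


def count_api_endpoints(descriptions: Iterable[ServiceDescription]) -> int:
    """Endpoints known for OpenAPI/REST services (from a spec or added manually)."""
    return sum(len(d.endpoints) for d in descriptions if d.kind != "SOAP")


# Made-up example: one WSDL, one OpenAPI service, one REST service.
descriptions = [
    ServiceDescription("SOAP", "registry.wsdl", ["getA", "getB", "getC", "getD"]),
    ServiceDescription("OPENAPI", "pets", ["GET /pets", "POST /pets",
                                           "GET /pets/*", "DELETE /pets/*"]),
    ServiceDescription("REST", "legacy"),  # no endpoints added manually
]
print(count_service_identifiers(descriptions), count_api_endpoints(descriptions))
```

In this example the SOAP description and the OpenAPI service both expose four operations, yet they contribute four identifiers and one identifier respectively, which is why raw identifier counts need interpretation.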

Nevertheless, the service-related numbers don't say anything about the type of services. For example, they may provide simple access to data, execute a business process, provide service orchestration, etc. Therefore, additional analysis going behind the numbers is needed when comparing the available services of two X-Road ecosystems or evaluating the service coverage of a single ecosystem.

In addition to these metrics, it’s highly recommended that X-Road Operators implement the X-Road Metrics extension to get more detailed insights into data exchange in the X-Road ecosystem.

From connectivity between databases towards an ecosystem of ecosystems

The challenges that X-Road® addressed in Estonia in 2001 included the lack of private networks – which resulted in developing secure data exchange over the public Internet – and connectivity (between databases) rather than data availability and discovery for the cross-use of data, including operational data of the ecosystem.

The operational data generated in over twenty X-Road environments worldwide is gradually emerging as a significant digital asset and should be better utilised for creating insights and optimal decisions, enhancing the processes and, thereby, the product.

In the future, X-Road as a connected ecosystem of ecosystems could get System of Systems (SoS) characteristics, which require thinking beyond questions usually associated with engineering. In this blog post, we’ll get food for thought about what the data-enabled future of X-Road could look like.

Additional Building Blocks of an X-Road Ecosystem

X-Road® is an open-source software and ecosystem solution that provides unified and secure data exchange between organisations. X-Road is based on a distributed model, and it enables decentralised data management and data sovereignty within the ecosystem. Every organization is in full control of its data and services, and data is always exchanged directly between two trusted members without third parties having access to it.

The X-Road operator is the owner of the X-Road ecosystem and is responsible for all the aspects of the operations. The responsibilities include defining regulations and practices, accepting new members, providing support for members, and operating the central components of the X-Road software. X-Road members are organizations that have joined the ecosystem and produce and/or consume services with other members. A member organization can be a service provider, a service consumer, or both. Also, a functioning X-Road ecosystem requires two types of trust services: 1) time-stamping authority (TSA) and 2) certification authority (CA). Trust service providers are organizations providing these services.

Technically, the X-Road core consists of a Central Server and Security Server that are the foundational building blocks of the ecosystem. These components are required together with the trust services to establish a trusted network of organisations and enable secure data exchange between its members.

Besides the core and trust services, building a functional and scalable ecosystem requires some additional building blocks that support the operations and use of the ecosystem. These building blocks provide member management and onboarding capabilities, service discovery, metrics collection and reporting, and technical monitoring. The X-Road core doesn't currently offer the capabilities, and therefore, additional building blocks are required. In general, implementing and maintaining these building blocks is the responsibility of the X-Road operator. Next, let's take a closer look at these building blocks.

Service management portal

A service management portal or a self-service portal is a web portal for managing the administrative details of the ecosystem membership. The administrative tasks related to the membership and its management are usually completed using the portal. For example, new members must first send a membership application and, once it has been approved, sign a membership agreement where they agree to follow the terms and conditions of the ecosystem. Also, the portal doesn't have to be limited to the administrative level, and it may cover some technical configuration steps, e.g., requesting certificates, and provide ecosystem-specific documentation and instructions. Depending on the implementation, some parts of the process may be automated, while others require manual input. Also, the portal may provide separate interfaces for the representatives of the member organisations and the X-Road operator.

The Central Server contains a registry of X-Road member organisations and their Security Servers. The information managed by the Central Server is technical and doesn't overlap with the data managed by the service management portal. Also, the service management portal may support the technical onboarding process by automating parts that are not directly covered by the Central Server, e.g., requesting certificates. Since there's a strong connection between the technical and administrative information, the portal and Central Server may be connected. However, a management REST API for the Central Server that enables a seamless integration is a work in progress currently.

An alternative to the service management portal is to manage the membership information in a system that members cannot access directly and use email (or some other channel) for communications. For example, maintain the member information in an Excel file or an internal wiki page, and receive service requests by email. It is a quick and easy way to get started with the ecosystem, but it doesn't scale very well, provides little support for automation, and is not very user-friendly. Therefore, implementing a service management portal is highly recommended.

Currently, there’s no off-the-shelf open-source component available that could be used as a service management portal for X-Road. In general, a service management portal is a custom component, and it may also support a broader range of digital services. Also, it is often connected to ecosystem-specific backend services and registries, such as business registry, service catalog, authorisation service, etc. The Estonian and Finnish service management portals are good examples of how the portal can be implemented.

Service catalog

A service catalog is a web portal that contains descriptions of all the services available in the ecosystem. The primary purpose of the service catalog is to provide a user-friendly channel to search and discover available services. Also, the catalog may provide additional features to support the use of the services, e.g., request access to a service, sign a service agreement, etc. The service catalog is targeted at both business and technical users.

When services are connected to X-Road, their service descriptions are published on the Security Server by the Security Server administrator. The service descriptions can then be accessed using a service discovery mechanism provided by X-Road. However, the mechanism is very technical and requires direct access to the Security Server's messaging interface. Also, getting a list of all services available in the ecosystem would require querying each Security Server separately. Therefore, a more user-friendly service catalog is needed.

When implementing the service catalog, collecting the service descriptions from the Security Servers can be automated. In that way, the descriptions need to be maintained in a single place only, and all the changes in the source are automatically updated to the catalog. Nevertheless, additional metadata must be manually added and maintained on the catalog by a service administrator representing the organisation owning the service. The metadata may include any information related to the service and its use, e.g., a more detailed description of the data set, terms and conditions, contact information, pricing information, SLAs, etc.  
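The split between automatically collected descriptions and manually maintained metadata can be sketched as below. The function names, the `fetch` callable and all data are hypothetical; a real collector would query each Security Server's metadata services instead of reading from an in-memory dictionary.

```python
from typing import Callable, Dict, Iterable

Descriptions = Dict[str, str]          # service identifier -> description text
Metadata = Dict[str, Dict[str, str]]   # service identifier -> extra fields


def collect_descriptions(servers: Iterable[str],
                         fetch: Callable[[str], Descriptions]) -> Descriptions:
    """Query every Security Server and merge the results centrally."""
    collected: Descriptions = {}
    for server in servers:
        collected.update(fetch(server))
    return collected


def build_catalog(collected: Descriptions, manual: Metadata) -> Dict[str, Dict[str, str]]:
    """Attach manually maintained metadata to the collected descriptions."""
    return {
        service_id: {"description": text, **manual.get(service_id, {})}
        for service_id, text in collected.items()
    }


# Stand-in for real metadata queries; the data is made up.
FAKE_SERVERS = {
    "ss1.example.org": {"EX/GOV/12345/population-register/persons": "OpenAPI: persons"},
    "ss2.example.org": {"EX/GOV/67890/business-register/getCompany": "WSDL: getCompany"},
}
manual_metadata = {
    "EX/GOV/12345/population-register/persons": {"contact": "registry@example.org"},
}

catalog = build_catalog(
    collect_descriptions(FAKE_SERVERS, FAKE_SERVERS.__getitem__),
    manual_metadata,
)
for service_id, entry in catalog.items():
    print(service_id, entry)
```

Re-running the collection refreshes the descriptions from their source, while the manually added fields stay keyed by service identifier and survive each refresh.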

The Estonian, Finnish and Icelandic (only in Icelandic) service catalogs serve as examples of how the catalog can be implemented. The source code of the Finnish catalog is freely available on GitHub, and it consists of two separate components: a service catalog portal and a collector to read the service descriptions from Security Servers and store them centrally. Currently, NIIS doesn’t provide a service catalog component for X-Road.

Reporting and metrics

Reporting and metrics mean collecting usage statistics and metrics from an X-Road ecosystem. The metrics include service usage statistics, response times, request sizes, service health data, etc. The metrics can be used to measure the size and activity of the ecosystem, and they also provide interesting information about the relationships between different member organisations and their services. The information enables the X-Road operator to overview the ecosystem's state and measure its growth. Thanks to the data, the X-Road operator can make informed decisions on the development and governance of the ecosystem.

To get an overview of the whole ecosystem, the raw metrics must first be read from all Security Servers and then stored and analysed centrally. However, it’s important to remember that the metrics do not contain data that is exchanged by the members, only metadata about the use of the services. Collecting the information requires installing the operational monitoring add-on on the Security Servers. The add-on collects the raw metrics data locally and makes it available through a query interface. Nevertheless, access to the interface is restricted so that only the X-Road operator can access the data of all member organisations. Regular members can access their data only.
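Once the raw metrics have been collected centrally, turning them into ecosystem-level statistics is mostly a matter of aggregation. The sketch below uses invented records with hypothetical field names (`service`, `duration_ms`, `succeeded`); the actual fields provided by the operational monitoring add-on are described in its documentation.

```python
from collections import defaultdict
from statistics import mean

# Made-up metadata records: no message payloads, only facts about each request.
records = [
    {"service": "EX/GOV/12345/population-register/getPerson", "duration_ms": 120, "succeeded": True},
    {"service": "EX/GOV/12345/population-register/getPerson", "duration_ms": 80, "succeeded": True},
    {"service": "EX/GOV/12345/population-register/getPerson", "duration_ms": 100, "succeeded": False},
    {"service": "EX/GOV/67890/business-register/getCompany", "duration_ms": 50, "succeeded": True},
]


def summarise(rows):
    """Group records by service and compute simple usage statistics."""
    by_service = defaultdict(list)
    for row in rows:
        by_service[row["service"]].append(row)
    return {
        service: {
            "requests": len(group),
            "mean_duration_ms": mean(r["duration_ms"] for r in group),
            "success_rate": sum(r["succeeded"] for r in group) / len(group),
        }
        for service, group in by_service.items()
    }


summary = summarise(records)
for service, stats in summary.items():
    print(service, stats)
```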

X-Road Metrics is an open-source component maintained by NIIS to centrally collect, store, process, and publish the data provided by the operational monitoring add-on. X-Road Metrics consists of multiple modules, and its features include but are not limited to publishing the data as open data (from the Estonian ecosystem), generating a dependency graph of member organisations (from the Estonian ecosystem), and providing statistical reports to members. However, making other implementations that utilise the operational monitoring data is also possible since all the documentation and source code are publicly available. Of course, member organisations are free to use the data in their reporting and monitoring systems.

Technical monitoring

Technical monitoring means collecting technical monitoring and health data from an X-Road ecosystem. The data can be used to monitor the Security Server's health. The data includes system metrics (CPU load, free memory, available disk space, etc.), a list of running processes, X-Road version information, certificate details, etc. Also, the data can be used to recognise potential future problems and maintenance needs before they affect the operations of the Security Server, e.g., to detect certificates that are about to expire or Security Server versions that are no longer supported. The information enables the X-Road operator to get an overview of the ecosystem’s health and monitor the maintenance of individual Security Servers.
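The two maintenance checks mentioned above could be sketched roughly as follows. The `ServerStatus` structure, the 30-day warning window, and the minimum supported version are assumptions made for the example; they don't describe the actual data format of the monitoring interface or any official support policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ServerStatus:
    """Hypothetical, simplified view of one Security Server's monitoring data."""
    server_code: str
    xroad_version: tuple   # e.g. (7, 4, 0)
    cert_not_after: date   # expiry date of a certificate on the server


def find_maintenance_needs(servers, today, min_supported, warn_days=30):
    """Return (server_code, reason) pairs for servers that need attention."""
    findings = []
    for s in servers:
        if s.cert_not_after - today <= timedelta(days=warn_days):
            findings.append((s.server_code, "certificate expires soon"))
        if s.xroad_version < min_supported:
            findings.append((s.server_code, "unsupported X-Road version"))
    return findings


servers = [
    ServerStatus("ss1", (7, 4, 0), date(2025, 1, 1)),
    ServerStatus("ss2", (6, 26, 0), date(2024, 1, 10)),
]
findings = find_maintenance_needs(
    servers, today=date(2023, 12, 20), min_supported=(7, 2, 0)
)
```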

At first, technical and operational monitoring may sound like the same thing or very similar. The difference is that technical monitoring concentrates on the Security Server while operational monitoring is about monitoring services connected to X-Road. However, the way data is recorded on the Security Server and then collected and analysed centrally is very similar.

To technically monitor the whole ecosystem, the raw monitoring data must first be read from all Security Servers and then stored and analysed centrally. The collected data doesn’t include sensitive information, only technical data about the Security Server itself. Also, the Security Server administrator can configure the data set that can be collected centrally. Collecting the information requires installing the environmental monitoring add-on on the Security Servers. The add-on collects the raw data locally and makes it available through a query interface. Like with operational monitoring, access to the interface is restricted so that only the X-Road operator can access the data of all Security Servers. Regular members can access their own Security Servers’ data only.

A common approach is to use existing monitoring tools and platforms to centrally store, analyse and visualize the technical monitoring data, e.g., Elasticsearch and Kibana. However, an X-Road-specific component is needed to read raw data from Security Servers. The Finnish Digital Agency has implemented such a component, and they've published it on GitHub under the MIT license. Currently, NIIS doesn’t provide a central environmental monitoring component for X-Road that could be used to monitor the ecosystem's health.

Where do I find the specifications?

Service management portal, service catalog, reporting and metrics, and technical monitoring building blocks offer capabilities that aren’t currently provided by the X-Road core. Those capabilities are not required when setting up a new X-Road ecosystem, but they certainly make operating and developing the ecosystem easier. First and foremost, they provide the X-Road operator with additional tools for informed decision-making and automating management processes.

The building blocks mentioned in this blog post are described on a conceptual level, and there's no formal specification for them. Therefore, every implementation of the building blocks is different and may not always provide the same set of features. However, how the building blocks are technically connected to X-Road must be based on the X-Road protocols and interfaces. Therefore, replacing one implementation with another should be possible if connections to other backend systems are ignored.

Currently, NIIS provides the implementation of the reporting and metrics building block in the form of X-Road Metrics. Also, implementations of the service catalog and technical monitoring building blocks are available as open-source. The service management portal is the only building block that doesn’t have open-source implementations available. More implementations are likely to become available in the future, and the technical specifications of the concepts will be defined in more detail.

The Message Room Concept in Practise

This is a series of blog posts about the Message Room concept. The first part provides an introduction to synchronous and asynchronous data exchange. The second part concentrates on life-event-based services and potential implementation alternatives. Three alternative implementation approaches are discussed in more detail in the third part.

In my previous blog post, I covered five different ways the Message Room concept could be implemented. In this blog post, I will explain in more detail how NIIS has approached three of the five alternatives at a more practical level. During the last two years, NIIS has conducted multiple research and development activities that have concentrated on alternatives 1-3 (built-in, integrated, connected).

Integrating Apache Kafka into X-Road (integrated)

In collaboration with the University of Tartu, NIIS conducted a research project on supporting event-driven architecture in the context of X-Road. The study consisted of two parts that were completed in 2020-2021. The final report is available in the X-Road Document Library.

The first part of the project produced a report (requirement analysis and feasibility study) and a proof-of-concept implementation which brought a subset of Apache Kafka's capabilities into X-Road. The second part studied Apache Kafka's integration into X-Road further and explored topics that must be covered in a production-level integration, for example, high availability, authentication, access rights management, and service discovery. Implementing Kafka management operations in X-Road was out of the project’s scope.

Image 1. Integrating Apache Kafka into X-Road (integrated).

The integration is based on the idea that the existing X-Road protocols are used for a handshake to establish an asynchronous communication channel between the message exchange parties. The asynchronous communication is implemented by the new XGate add-on developed in the project. First, the service consumer sends a regular X-Road request to the service provider using the X-Road Message Protocol for REST (1.). The request is transmitted between the Security Servers using the X-Road Message Transport Protocol. The service provider receives the message, instantiates an interface for asynchronous communication, and adds routing information in the response (2.). The service consumer reads the routing information from the response, establishes a connection to the interface (3.), and reads data from it (4.). A more detailed description of the implementation is available in the final report.
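The four steps can be mimicked with a small in-memory sketch. It is not the XGate implementation: the class, the method names, and the shape of the routing information (`channel_id`) are hypothetical, and a direct method call stands in for the X-Road request relayed by the Security Servers.

```python
import queue
import uuid


class ServiceProvider:
    """Hypothetical provider-side logic behind the Security Server."""

    def __init__(self):
        self._channels = {}

    def handle_request(self, request):
        # Step 2: create an interface for asynchronous communication and
        # describe how to reach it in the response.
        channel_id = str(uuid.uuid4())
        self._channels[channel_id] = queue.Queue()
        return {"status": 200, "routing": {"channel_id": channel_id}}

    def connect(self, channel_id):
        # Step 3: the consumer connects using the routing information.
        return self._channels[channel_id]

    def emit(self, channel_id, event):
        # The provider pushes data into the asynchronous interface.
        self._channels[channel_id].put(event)


provider = ServiceProvider()

# Step 1: in a real deployment this would be a regular X-Road REST request;
# here it is a direct method call.
response = provider.handle_request({"topic": "births"})
channel_id = response["routing"]["channel_id"]

channel = provider.connect(channel_id)
provider.emit(channel_id, "event-1")
provider.emit(channel_id, "event-2")

# Step 4: read data from the asynchronous interface.
received = [channel.get_nowait(), channel.get_nowait()]
```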

The adapter service approach (connected)

X-Road-Kafka Adapter is an adapter component that connects Apache Kafka topics to X-Road. The Adapter supports both producing and consuming data over a simple REST API. The producers publish data to a topic, and the consumers poll the topic and pull data from it. The Adapter sits between the Security Server and Apache Kafka and converts messages between the X-Road message protocol for REST and Kafka protocol. However, the Adapter doesn’t support streaming. 

Image 2. The adapter service approach (connected).

In practice, both producers and consumers may use Kafka's native API directly, or alternatively, they can use a REST API provided by the Adapter over X-Road. In real life, producers and consumers owned by the organisation running Kafka would probably use the native API, whereas external producers and consumers would connect to Kafka through X-Road. However, all the Kafka management and maintenance operations must be done using the native API since the Adapter only supports a limited subset of operations for producing and consuming messages.

More detailed information about the Adapter is available on GitHub. The Adapter is currently on a proof-of-concept level and requires further development before it can be used to run production workloads. The source code is licensed under the MIT open-source license and anyone interested in the Adapter is welcome to contribute to its development.

The X-Road way (built-in)

NIIS is currently working on a proof-of-concept (PoC) level implementation of the built-in approach. The goal of the PoC is to implement the Message Room concept as a Security Server proxy add-on (like messagelog, metaservice, op-monitoring, etc.). In that way, the implementation is modular and can be installed on selected Security Servers only. However, the add-on needs to be installed only on Security Servers that are acting as publishers. No code changes are required on Security Servers acting as subscribers, which means that all existing supported Security Server versions can act as subscribers.

It must be noted that the features included in the PoC are just a narrow subset of potential features of a production-level implementation. The PoC aims to test the concept with minimum viable features, and therefore, many features and functionalities are left out on purpose. The PoC implementation covers the following functionality:

  • Push-push publish/subscribe model.

    • Publishers push messages to a Message Room.

    • A Message Room pushes messages to subscribers - subscribers don't need to poll the Message Room.

  • One publisher per Message Room.

  • One publisher Security Server per Message Room.

  • Multiple subscribers per Message Room.

  • Message Rooms are public - anyone can subscribe to them.

  • Security Server provides the required interfaces to:

    • Publish messages to a Message Room.

    • Subscribe to a Message Room.

    • Unsubscribe from a Message Room.

  • Support for federation.

  • Message Rooms are content-type and payload agnostic.

    • Messages that are published to a Message Room can be of any content-type, e.g., XML, JSON, text, binary, etc.

  • All X-Road security guarantees (except access control for Message Rooms) are supported.

The following restrictions apply to the PoC implementation:

  • Pulling data from a Message Room is not supported.

    • In the push-pull model, publishers push messages to a Message Room, and subscribers pull the messages from the Message Room.

  • Private Message Rooms are not supported.

    • It's not possible to control who's allowed to subscribe to a Message Room.

  • Internal load-balancing is not supported.

    • It's not possible to publish messages to a Message Room from multiple Security Servers.

    • A Message Room is coupled with a single Security Server.

  • No error handling.

    • If the recipient is not available, the message is lost.

    • If the publisher's Security Server crashes, (some) messages are lost.

  • No support for service discovery.

    • There's no automated way to discover what Message Rooms are available.

    • Potential subscribers must know the subsystem code of the Message Room when subscribing to it.

  • No changes to the Security Server UI and management REST API.

    • If new configuration items are introduced, their values are configured using configuration files or database queries.

  • Subscribe and unsubscribe interfaces support only REST. SOAP is not supported.

The Message Room add-on

The Message Room PoC is implemented as a Security Server add-on.

Image 3. The X-Road way (built-in).

The add-on includes three new interfaces:

  • Publish - publish messages to a Message Room.

  • Subscribe - subscribe to a Message Room.

  • Unsubscribe - unsubscribe from a Message Room. 

The publish interface is used by internal clients to publish messages to a Message Room. The publisher of the message is defined using the "X-Road-Client" HTTP header. The message body is fully data format-agnostic, just like the REST interface. The Message Room where the message is published is the same as the client subsystem. However, only subsystems with a special Message Room status can be used as a Message Room. 

The subscribe and unsubscribe interfaces are used by external clients to manage their subscriptions to a Message Room. The subscribe interface is used to subscribe to a Message Room. Similarly, the unsubscribe interface is used to unsubscribe from a Message Room. In the PoC implementation, the interfaces are accessible by anyone, and there's no access control to them. For example, listMethods and listAllowedMethods metaservices work in the same way.

The message body of the subscribe interface must contain the service ID where messages published to the Message Room are sent. Also, the service must be owned by the same client who sends the subscribe message. The same applies to unsubscribe requests, too - the sender of the request must be the owner of the service specified in the request body. Also, subscribers must allow the publisher to send messages to the service specified in the subscribe message.
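The ownership rule for subscribe and unsubscribe requests can be sketched as below. The sketch assumes that identifiers are written as slash-separated strings (client: INSTANCE/CLASS/MEMBER/SUBSYSTEM, service: the same with a trailing service code) and that "owned by the same client" means the service ID starts with the client's identifier. The registry class itself is hypothetical and is not part of the PoC code.

```python
class SubscriptionRegistry:
    """Hypothetical in-memory store of Message Room subscriptions."""

    def __init__(self):
        self._subscriptions = {}  # room id -> set of subscriber service ids

    @staticmethod
    def _owns(client_id, service_id):
        # Assumption: a service belongs to the client whose identifier is
        # the service ID without its last component (the service code).
        return service_id.split("/")[:-1] == client_id.split("/")

    def subscribe(self, room_id, client_id, service_id):
        if not self._owns(client_id, service_id):
            raise PermissionError("service is not owned by the requesting client")
        self._subscriptions.setdefault(room_id, set()).add(service_id)

    def unsubscribe(self, room_id, client_id, service_id):
        if not self._owns(client_id, service_id):
            raise PermissionError("service is not owned by the requesting client")
        self._subscriptions.get(room_id, set()).discard(service_id)

    def subscribers(self, room_id):
        return sorted(self._subscriptions.get(room_id, set()))


registry = SubscriptionRegistry()
room = "DEV/GOV/1111/Births"
registry.subscribe(room, "DEV/GOV/2222/Benefits", "DEV/GOV/2222/Benefits/notify")

try:
    # A client tries to register a service that belongs to someone else.
    registry.subscribe(room, "DEV/GOV/3333/Other", "DEV/GOV/2222/Benefits/notify")
    rejected = False
except PermissionError:
    rejected = True
```

The sketch doesn't cover the last rule in the paragraph above: the subscriber must separately allow the publisher to call the service, which is a matter of access rights on the subscriber's own Security Server.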

Image 4. The message flow using the built-in approach.

The basic message flow for publishing messages is explained in the diagram above:

  1. A publisher publishes a message to a Message Room using the publish endpoint.

  2. The proxy stores the message in memory and returns an acknowledgment message to the publisher.

  3. A scheduled Message Room Processor reads new unprocessed messages from memory and reads the subscriber service IDs from the serverconf database.

  4. The Message Room Processor sends the message to the subscribers as a regular X-Road message. For example, if there are 5 subscribers, 5 messages - one for each subscriber - are sent. The Message Room subsystem is used as the sending client, and the messages are signed with the publisher member's signing key. Each message is logged independently by the proxy. Steps 6-9 are repeated for each subscriber. If there are no subscribers, the message is removed without further processing.

  5. The proxy sends the message to a subscriber.

  6. The subscriber's Security Server processes the message and forwards it to the service (subscriber) defined in the request.

  7. The service returns a confirmation that it has received the message.

  8. The confirmation is returned as a regular X-Road response.

  9. The Message Room Processor receives the response. The content of the response is ignored. However, if the response contains an error, the error is logged in the proxy log.
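The flow above can be approximated with the following in-memory sketch. It only mirrors the described behaviour (acknowledge, buffer in memory, fan out, ignore responses, log errors, no retries). The class name, the `send` callable, and the identifiers are hypothetical, and signing, message logging, and the serverconf database are left out.

```python
import logging
from collections import deque

log = logging.getLogger("message-room")


class MessageRoomProxy:
    """Hypothetical stand-in for the publisher's Security Server proxy."""

    def __init__(self, subscriptions, send):
        self._pending = deque()              # step 2: messages are kept in memory only
        self._subscriptions = subscriptions  # room id -> list of subscriber service ids
        self._send = send                    # callable(room_id, service_id, body)

    def publish(self, room_id, body):
        # Steps 1-2: accept the message and acknowledge it immediately.
        self._pending.append((room_id, body))
        return {"status": "accepted"}

    def process_pending(self):
        """One run of the scheduled processor (steps 3-9)."""
        delivered = 0
        while self._pending:
            room_id, body = self._pending.popleft()
            # With no subscribers the loop body never runs and the message is dropped.
            for service_id in self._subscriptions.get(room_id, []):
                try:
                    self._send(room_id, service_id, body)  # response content is ignored
                    delivered += 1
                except Exception as exc:
                    # No retry: the error is only logged and the message is lost.
                    log.error("delivery to %s failed: %s", service_id, exc)
        return delivered


inbox = []


def send(room_id, service_id, body):
    if service_id.endswith("/down"):
        raise ConnectionError("subscriber unavailable")
    inbox.append((service_id, body))


proxy = MessageRoomProxy(
    {"DEV/GOV/1111/Births": ["DEV/GOV/2222/App/notify", "DEV/GOV/3333/App/down"]},
    send,
)
ack = proxy.publish("DEV/GOV/1111/Births", b'{"event": "birth"}')
delivered = proxy.process_pending()
```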

What’s next?

Based on the results provided by different approaches, the built-in approach has proven to be the most promising. It can offer the same security guarantees that X-Road currently provides, including authentication, identity management, message logging, signing, and timestamping. Also, it can be expanded to a decentralized publish-subscribe implementation that enables many-to-many communication without a centralized message broker. Therefore, the next step is to test the PoC implementation of the built-in approach with a couple of selected use cases in Estonia and Finland during the first half of 2022. More information about the potential use cases is available in the previous blog post by Petri Kettunen.

Nevertheless, the NIIS members have not decided whether a production-level implementation of the Message Room concept will be included in X-Road. The PoC use cases implemented in Estonia and Finland will provide valuable information to support the decision-making. Once the PoC use case implementations have been completed, a decision regarding the production-level implementation will be taken by the NIIS members. Meanwhile, NIIS participates in the implementation of the PoC use cases and continues to develop the concept further.

The Message Room Concept Implementation Alternatives

This is a series of blog posts about the Message Room concept. The first part provides an introduction to synchronous and asynchronous data exchange. The second part concentrates on life-event-based services and potential implementation alternatives. Three alternative implementation approaches are discussed in more detail in the third part.

The Message Room concept was first introduced by Kristo Vaher, the government CTO of Estonia, in his paper Next Generation Digital Government Architecture in 2020. However, in the paper, the name X-Room was also used for the concept. In the later design work conducted by NIIS, the name Message Room has been used, and therefore, I'm going to use it from here on.

According to the Next Generation Digital Government Architecture paper, the Message Room concept provides different administration sectors with an asynchronous communication channel that is the technical enabler for implementing life event-based services. In practice, the implementation would follow the publish-subscribe pattern, and it would be based on X-Road, including all X-Road's existing security guarantees. In that way, the existing infrastructure and data exchange ecosystem are utilized instead of reinventing the wheel and building everything from scratch.

What are life event-based services?

The idea behind life event-based services is that an event happens in a person's life and is registered by an information system. Then the information system notifies the other information systems that have registered their interest in receiving updates on that specific event type. Based on the event, various processes are then triggered to provide the citizen with services related to the event. In this way, it is possible to offer or suggest services to citizens automatically instead of the citizens having to apply for them separately. For example, social benefits could be automatically offered to parents when a child is born.

What is a Message Room?

Generally, Message Room is an asynchronous messaging concept that decouples message producers and consumers and enables publishing messages and events to multiple consumers. Instead of sending messages directly between two information systems, message producers publish their messages to a Message Room with any number of consumers. The number of message publishers is not limited either, and a single Message Room can have one or more publishers.

Image 1. The Message Room concept.

A Message Room can be public or private. Anyone can publish and/or consume messages from public rooms, but private rooms can be accessed by authorized parties only. It is also possible that only authorized parties can publish messages, but consuming is allowed for anyone or vice versa.

Technically, a Message Room is an implementation of the publish-subscribe communication pattern. In other words, a Message Room can be considered a topic with multiple subscribers. A Message Room enables one-to-many and many-to-many communication.

Implementation alternatives

From a technical perspective, there are different ways to implement the Message Room concept. Every alternative has its pros and cons, and also the constraints and supported features vary between the alternatives. The implementation alternatives can be divided into five high-level categories:

  • built-in

  • integrated

  • connected

  • standalone

  • standardization.

It must be noted that the categories are not mutually exclusive, and they're partly overlapping. A potential outcome might very well be a combination of multiple categories. 

Built-in

The idea of the built-in approach is that required features are implemented around the existing X-Road concepts, components, and protocols. In practice, it means extending the current protocols, introducing new entities, and expanding the functionalities of different components. Everything is designed and implemented using the "X-Road way," which means that the implementation provides all the same security guarantees that are provided to synchronous messages. At the same time, the implementation also has to deal with the constraints caused by the same security guarantees.

Integrated

Integrated means taking an existing open-source messaging solution (e.g., Apache Kafka, RabbitMQ) and integrating it into X-Road. In this way, the existing solution provides most of the messaging features, and there's no need to implement them separately. Some changes/additions to the X-Road protocols are likely to be required, but the data exchange inside a message room is based on the protocol(s) supported by the selected solution. Also, the integration covers most of the management-related tasks so that they can be completed using X-Road provided components, e.g., the Security Server UI and management API.

However, integrating the solution into X-Road potentially requires significant effort, and there's a tight coupling between X-Road and the selected solution. Another downside is that X-Road becomes highly dependent on an external solution whose future development and roadmap are out of NIIS's control. Also, it's very likely that providing all the same security guarantees provided by synchronous messages isn't possible, e.g., signing and timestamping all the data processed by a Message Room.

Connected

The connected approach uses X-Road to establish a connection between the data exchange parties and implement the actual data exchange through an external channel outside of X-Road. In this case, X-Road provides a secure channel for the initial handshake that includes exchanging the details of the external channel.

The handshake is a regular X-Road message in which headers and/or body include the required details. The external channel doesn't directly connect with X-Road, so it may use any data exchange protocol. In this way, the approach is not coupled with any specific solution or technology, and it can be used to support multiple different solutions and use cases. For example, different messaging solutions like Apache Kafka, RabbitMQ, and Apache ActiveMQ could all be supported. However, the external solution is managed independently of X-Road, and all configuration tasks must be done using the native interfaces and tools provided by the solution. Also, the X-Road security guarantees are provided for the handshake but not for the data exchange.

Standalone

Standalone means implementing Message Rooms as a standalone, fully independent solution that can be connected to X-Road. In this way, the solution can be used together with X-Road, but also without it. Connecting the solution to X-Road could be done using the integrated or connected approach.

An existing open-source messaging platform, such as Apache Kafka, could be taken as a basis and developed further if needed. No additional development is required in the best case, and an existing solution can be used as-is. Also, the solution could be implemented using an existing standard or a set of standards that would support interoperability on a broader scale.

Standardization

Standardization is about creating a protocol stack based on a standard or a set of standards that define all the aspects of the communications required by a Message Room. In practice, it means leveraging existing standards and creating new ones in case they are needed. This approach would make Message Rooms both technology and solution agnostic. The first step would be to study whether suitable standards that meet the requirements already exist and participate in their development. For example, the Message Room concept could be connected to data spaces, e.g., an asynchronous communication method within a data space and across different data spaces.

Requirements for the implementation

In theory, all the described implementation alternatives are technically feasible, but there are various differences when looking into them in more detail. Eventually, the choice between the alternatives boils down to the expectations, requirements, and constraints for the Message Room implementation.

One of the recognized requirements for the implementation is the reuse of existing infrastructure and X-Road principles. In practice, it means utilizing X-Road's existing security mechanisms, including authentication, identity management, message logging, signing, and timestamping. Also, the Message Room concept implementation should be based on a decentralized architecture, and therefore, it must not include any centralized components.

All the Message Room-related research and development activities conducted by NIIS have concentrated on categories 1-3 (built-in, integrated, connected) since they directly include X-Road and, therefore, provide the most logical starting point for the implementation. The different activities will be covered in more detail in my next blog post.

Synchronous or Asynchronous Messaging?

The previous blog post by Petri Kettunen provided insights on the Message Room use case study conducted by the University of Helsinki. This is the first part of a series of blog posts about the Message Room concept, and it provides an introduction to synchronous and asynchronous data exchange. The second part concentrates on life-event-based services and potential implementation alternatives. Three alternative implementation approaches are discussed in more detail in the third part.

In 2020, NIIS conducted an X-Road feature study to identify the needs and challenges of X-Road operators and members in Estonia, Finland, and Iceland. Information was gathered via 20 interviews (26 participants) and supplemented with an online questionnaire in Estonia (29 respondents). In addition, two innovation workshops were organized to collect further details on selected topics. 

Messaging patterns were one of the topics included in the study. Currently, X-Road only supports synchronous request-response messaging. Still, the need for asynchronous messaging had been recognized based on user feedback and some new use cases, such as proactive life event-based services. Therefore, additional information and insights on the actual needs and requirements were needed to make more concrete plans regarding the next steps.

Synchronous and asynchronous messaging

An excellent way to understand the difference between synchronous and asynchronous messaging is to compare a phone and an email. A phone call is synchronous – both parties must be available for it, and you have to wait for an answer when asking questions on the call. Instead, an email is asynchronous – you can send an email when you want, and the recipient is free to choose when to read it and respond.

Currently, X-Road is based on synchronous communication suited for real-time data and document exchange. Synchronous data is exchanged via request-response pairs. On a simplified level, a service consumer sends out a request and then waits for a service provider's response. When the request is sent, the service consumer waits for the response until it is received or a timeout occurs. Synchronous messaging creates a tight coupling between the data exchange parties since the consumer is always dependent on the availability and performance of the provider. Also, changes on either side easily break down the connection.

In the case of asynchronous messages, a service consumer sends a request and continues processing other tasks. The service provider sends a response later once it has processed the request or, depending on the type of asynchronous messaging pattern, it may not send a response at all. Asynchronous messaging creates only a loose coupling between the data exchange parties, making them less dependent on each other.
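The difference can be illustrated with a short sketch using Python's asyncio. The `provider` coroutine, its delay, and the timeout value are made up for the example and have nothing to do with how X-Road itself processes messages.

```python
import asyncio


async def provider(request, delay=0.05):
    """Hypothetical service provider that needs some time to respond."""
    await asyncio.sleep(delay)
    return f"response to {request}"


async def synchronous_style(events):
    # The consumer waits for the response (or a timeout) before doing anything else.
    response = await asyncio.wait_for(provider("req-1"), timeout=1.0)
    events.append(response)
    events.append("other work")


async def asynchronous_style(events):
    # The consumer sends the request, continues with other tasks,
    # and picks up the response later.
    pending = asyncio.create_task(provider("req-2"))
    events.append("other work")
    events.append(await pending)


sync_events, async_events = [], []
asyncio.run(synchronous_style(sync_events))
asyncio.run(asynchronous_style(async_events))
```

In the synchronous variant, the response is recorded before the other work. In the asynchronous variant, the order is reversed because the consumer doesn't block while the provider is busy.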

From message exchange patterns to communication patterns

In practice, there are several message exchange patterns for implementing asynchronous messaging. Asynchronous messaging can be one-way or two-way, and it can be based on different combinations of push and pull, for example:

  • send a message with no response (push)

  • send a message with no response (pull)

  • send a message with an asynchronous response (push and pull)

  • send a message with an asynchronous response (pull and push).

The asynchronous message exchange patterns and their different combinations are used in various communication patterns. The publish-subscribe and message queue communication patterns were included in the X-Road feature study together with the asynchronous request-response message exchange pattern.

Asynchronous request-response

A service consumer sends a request and continues processing with other tasks. The service provider sends a response later once it has processed the request. The consumer may include in the request a destination for the provider to send a message with the response.
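A minimal sketch of the pattern, assuming the consumer passes a callback as the reply destination. In a real integration the destination would more likely be an address of a service owned by the consumer; the request fields used here (`id`, `payload`, `reply_to`) are hypothetical.

```python
from collections import deque


class ServiceProvider:
    """Hypothetical provider that processes requests some time after receiving them."""

    def __init__(self):
        self._backlog = deque()

    def accept(self, request):
        # The request is only queued; the immediate reply is an acknowledgment.
        self._backlog.append(request)
        return {"status": "accepted", "id": request["id"]}

    def process_backlog(self):
        while self._backlog:
            request = self._backlog.popleft()
            result = {"id": request["id"], "result": request["payload"].upper()}
            # The response is sent to the destination named in the request.
            request["reply_to"](result)


responses = []
provider = ServiceProvider()
ack = provider.accept(
    {"id": "42", "payload": "hello", "reply_to": responses.append}
)

# The consumer is free to continue with other tasks; nothing has arrived yet.
responses_before_processing = list(responses)

provider.process_backlog()
```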

Publish-subscribe

A service provider publishes messages (publisher), and any number of service consumers will receive them (subscribers). Service consumers that are interested in a publisher’s messages “subscribe” to a predefined channel that they know publishers will be sending messages to. When an event happens, a message is sent to all the service consumers who have subscribed to the channel. Many publishers and subscribers can use the channel, and each message is delivered to all the subscribers.

Message queue

A message queue is an asynchronous communication channel where service producers and consumers do not interact simultaneously. Service producers push messages onto a queue, and consumers read the messages from the queue. Only one consumer gets a particular message, no matter how many consumers read messages from that queue. Many producers and consumers can use the queue, but each message is processed only once by a single consumer.
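The delivery semantics of the two communication patterns can be compared side by side with a small sketch. The round-robin distribution in the queue example is an assumption made to keep the result predictable; in a real message queue, whichever consumer reads first gets the message.

```python
from collections import deque
from itertools import cycle


def publish_subscribe(messages, subscribers):
    """Every subscriber receives a copy of every message."""
    return {name: list(messages) for name in subscribers}


def message_queue(messages, consumers):
    """Each message is taken from the queue by exactly one consumer."""
    pending = deque(messages)
    received = {name: [] for name in consumers}
    for name in cycle(consumers):
        if not pending:
            break
        received[name].append(pending.popleft())
    return received


messages = ["m1", "m2", "m3"]
fan_out = publish_subscribe(messages, ["a", "b"])
shared = message_queue(messages, ["a", "b"])
```

With publish-subscribe, both "a" and "b" end up with all three messages. With the queue, the three messages are split between them, and no message is processed twice.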

What do the users truly desire?

According to the X-Road feature study results, the publish-subscribe pattern is the most preferred option for implementing life-event-based services. Also, according to the MoSCoW methodology, it’s a MUST together with asynchronous messaging. Message queues and data streaming were mentioned in the study as well, and they were considered essential by many participants. According to the MoSCoW methodology, they were both rated SHOULD.

After the feature study, interactive innovation workshops were organized with participants from Estonia, Finland, and Iceland. According to the results, the publish-subscribe pattern got the highest priority in expanding X-Road's current message exchange capabilities. Also, the results strengthen the image of the pattern's role in implementing the proactive life event-based services. Therefore, it is logical to concentrate on the publish-subscribe communication pattern. However, it doesn't mean that the other alternatives are forgotten or mutually exclusive. In the long term, different options remain open even if the publish-subscribe pattern is prioritized first.

Also, the goal is not to replace the currently supported synchronous request-response pattern with asynchronous messaging. Instead, the aim is to extend X-Road’s messaging capabilities with asynchronous messaging.