7 December 2023

X-Road 7.4.0 Is Here

The year 2023 is ending, and it's time to publish X-Road 7.4.0! The previous version, 7.3.0, was released in June, and it provided a new Central Server UI and management REST API. Version 7.4.0 includes several improvements for the Central Server, but the Security Server hasn’t been forgotten either.

The beta version is already out, and the official release version will be published in December 2023. The release notes are available here. Please note that the beta version doesn’t include all the changes mentioned in the release notes.

Let’s see what the highlights of the version 7.4.0 are!

Publishing global configuration over HTTPS

Until now, the Central Server and Configuration Proxy have only supported publishing global configuration over HTTP. Since the global configuration is signed, its integrity and authenticity are guaranteed despite lacking HTTPS support. Nevertheless, securing the global configuration download with HTTPS helps to maintain the privacy of the global configuration data.

A new private key and a self-signed TLS certificate are created automatically when installing a new Central Server or upgrading an existing installation from an older version. After the installation or upgrade, the Central Server Administrator must manually apply for a TLS certificate from a trusted Certificate Authority (CA) and then configure the certificate. This step is required because the Security Server does not trust the new automatically generated self-signed certificate by default. The Security Server supports disabling certificate verification, but disabling it in production environments is not recommended.

Starting from version 7.4.0, the global configuration will be available on the Central Server and Configuration Proxy ports 80 and 443. The old HTTP endpoint on port 80 guarantees that older Security Servers not supporting downloading global configuration over HTTPS continue to work normally. Also, the new Security Servers that support downloading global configuration over HTTPS will fall back to HTTP if the TLS certificate of the new HTTPS endpoint is not configured correctly.
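The fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not the Security Server's actual implementation: the function names, the injected `fetch` callable and the URL layout are assumptions made for the example. Whichever endpoint is used, the downloaded configuration still has to pass signature verification.

```python
from typing import Callable, Optional


class DownloadError(Exception):
    """Raised when the configuration cannot be fetched from any endpoint."""


def candidate_urls(host: str) -> list[str]:
    # HTTPS (port 443) is tried first; HTTP (port 80) remains as the fallback.
    # The bare "/" path is a placeholder, not the real download path.
    return [f"https://{host}/", f"http://{host}/"]


def download_with_fallback(
    host: str, fetch: Callable[[str], bytes]
) -> tuple[str, bytes]:
    """Try each candidate URL in order; return (url, payload) of the first success."""
    last_error: Optional[Exception] = None
    for url in candidate_urls(host):
        try:
            return url, fetch(url)
        except OSError as exc:
            # ssl.SSLError and urllib.error.URLError are both OSError subclasses,
            # so a failed TLS handshake ends up here and the next URL is tried.
            last_error = exc
    raise DownloadError(f"no endpoint reachable on {host}") from last_error
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular HTTP client and makes the fallback order easy to test.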

Changes in the global configuration download ports may cause changes to firewall configuration on the Central Server, Configuration Proxy and Security Server. Inbound traffic from the Security Servers to port 443 must be allowed on the Central Server and Configuration Proxy. Similarly, outbound traffic to the Central Server and Configuration Proxy port 443 must be allowed on the Security Servers.

When upgrading the Central Server from a version older than 7.4.0 to version 7.4.0 or later, the configuration anchor must be re-generated, distributed to all the Security Server administrators through an external channel (e.g., email) and imported to each Security Server to enable downloading global configuration over HTTPS. Without the new configuration anchor, Security Servers will continue to download global configuration over HTTP.

Enforcing new key creation when generating CSR on the Security Server

A good security practice is creating a new key pair when applying for a new authentication or sign certificate on the Security Server. Starting from version 7.4.0, the Security Server supports enforcing new key creation when generating a certificate signing request (CSR). The feature is enabled by default, which means that it's not possible to generate a new CSR for a key that already has an existing certificate. The feature can be disabled by setting the “proxy-ui-api.allow-csr-for-key-with-certificate” system property to true on the Security Server.
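The rule can be expressed as a small decision function. The sketch below only mirrors the behaviour described above. The function and parameter names are made up for the example and are not the Security Server's code.

```python
def may_generate_csr(
    key_has_certificate: bool,
    allow_csr_for_key_with_certificate: bool = False,
) -> bool:
    """Return True if a new CSR may be generated for the key.

    The second argument stands in for the
    "proxy-ui-api.allow-csr-for-key-with-certificate" system property,
    which is false by default (enforcement enabled).
    """
    # A key without a certificate can always be used for a CSR. A key that
    # already has one is accepted only when enforcement has been switched off.
    return allow_csr_for_key_with_certificate or not key_has_certificate
```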

Configuring a minimum required client Security Server version

From version 7.4.0, service providers can configure a minimum required X-Road software version for client Security Servers. It means that client Security Servers older than the configured version cannot access services anymore.

Service providers can configure the required minimum version using the “proxy.server-min-supported-client-version” system property. The property has no value by default, meaning a minimum version hasn't been set. When the value is set, all the minor and patch versions starting from the configured version are approved.

X-Road Operators may set this value in the country-specific meta packages. For example, Estonia and Finland set the value so that the three latest minor versions are supported, which means that with version 7.4.0, requests originating from client Security Servers older than version 7.2.0 are no longer accepted in the Estonian and Finnish X-Road ecosystems. The value is rolling, meaning it changes with every X-Road minor version: when version 7.5.0 is released, the minimum supported version becomes 7.3.0.
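As a rough illustration of the version check and the rolling minimum, consider the sketch below. It assumes plain "major.minor.patch" version strings and hypothetical function names. Real version strings may carry suffixes that this example doesn't handle, and the sketch doesn't cover a window that crosses a major version boundary.

```python
from typing import Optional


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a plain 'major.minor.patch' string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_client_accepted(client_version: str, min_version: Optional[str]) -> bool:
    """Accept every client when no minimum is set, otherwise compare versions."""
    if not min_version:
        return True
    return parse_version(client_version) >= parse_version(min_version)


def rolling_minimum(latest: str, supported_minor_versions: int = 3) -> str:
    """Oldest accepted version when the N latest minor versions are supported."""
    major, minor, _ = parse_version(latest)
    oldest_minor = max(minor - (supported_minor_versions - 1), 0)
    return f"{major}.{oldest_minor}.0"
```

With a window of three minor versions, `rolling_minimum("7.4.0")` gives 7.2.0 and `rolling_minimum("7.5.0")` gives 7.3.0, matching the Estonian and Finnish example above.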

Changes in default client information system communications ports

The default client information system inbound communication ports are changed from 80 to 8080 and from 443 to 8443 on Ubuntu-based Security Servers. In this way, the default ports are consistent regardless of the hosting operating system.

The change is only applied to new installations, and existing ones are unaffected by it. Also, if custom ports have been configured locally by the Security Server administrator, they are unaffected.

X-Road Operators may set the default port values in the country-specific meta packages. The Estonian ecosystem uses ports 80 and 443 by default for new installations after the change. In other words, the Estonian X-Road user organisations are unaffected by the change.

LDAP group mapping support

The Central Server and Security Server administrator user permissions are managed using X-Road-specific Linux groups mapped to X-Road user roles. An administrator must be a member of the X-Road-specific groups to access the management UI.

Starting from version 7.4.0, it’s possible to map additional Linux groups to X-Road user roles using the “proxy-ui-api.complementary-user-role-mappings” and “admin-service.complementary-user-role-mappings” system properties. It enables granting administrators permissions to the management UI using existing Linux groups instead of adding administrators to the X-Road-specific groups. In this way, existing Linux groups can be mapped to the X-Road user roles, and there's no need to change users' groups.
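Conceptually, the complementary mappings extend the set of Linux groups that grant a given role. The sketch below illustrates the idea only. The role names are examples, the dictionary structure is hypothetical, and it doesn't reflect the actual value format of the system properties, which is documented in the user guides.

```python
from typing import Iterable, Mapping

# Example role names; each role is also granted by the Linux group of the same name.
XROAD_ROLES = (
    "xroad-security-officer",
    "xroad-system-administrator",
    "xroad-service-administrator",
)


def resolve_roles(
    user_groups: Iterable[str],
    complementary_mappings: Mapping[str, Iterable[str]],
) -> set[str]:
    """Return the roles granted by the user's Linux groups.

    complementary_mappings maps a role name to additional group names that
    should grant the same role (hypothetical structure for this example).
    """
    groups = set(user_groups)
    roles = set()
    for role in XROAD_ROLES:
        accepted_groups = {role} | set(complementary_mappings.get(role, ()))
        if accepted_groups & groups:
            roles.add(role)
    return roles
```

For instance, a user who belongs only to a hypothetical `ldap-sec-admins` group gets the security officer role once that group is mapped to it, without being added to the X-Road-specific group.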

Rotating global configuration sign keys

The Central Server global configuration sign keys are included in the configuration anchor imported to the Security Server. Using the information in the anchor, the Security Server can verify the global configuration signature. Before version 7.4.0, rotating the sign keys meant that a new configuration anchor had to be generated on the Central Server, distributed to all the Security Server administrators through an external channel (e.g., email) and imported to each Security Server. Only after that were the changes applied by each Security Server.

Starting from version 7.4.0, the Central Server global configuration sign keys are included in the global configuration. Thanks to this, rotating the sign keys is possible without importing a new version of the configuration anchor to each Security Server. Instead, the Security Server gets the updated configuration information from the global configuration directly and applies it immediately.

Importing the configuration anchor to the Security Server manually is still possible. It's required when the Security Server is initialised for the first time and when a Security Server was offline or turned off while the global configuration sign key was rotated.

Disabling a subsystem temporarily

The Security Server administrator can now temporarily disable a subsystem so that none of the services added under the subsystem can receive requests. If the same subsystem is registered on multiple Security Servers, disabling the subsystem on one Security Server doesn’t affect the others. When a subsystem is disabled, the association between the subsystem and the Security Server where the subsystem is disabled is not visible to other Security Servers in the ecosystem. This applies to the local and federated ecosystems.

Disabling a subsystem doesn't change any configuration items regarding the subsystem or its services. It only makes the association between the Security Server and the subsystem invisible to other Security Servers so that they cannot send requests to it. Similarly, a disabled subsystem cannot be used as a service client either.

Disabling and enabling a subsystem takes effect with a delay of a couple of minutes when the default configuration is used. The delay exists because the subsystem state is maintained on the Central Server, and updating it requires the Security Server to communicate with the Central Server. Its length depends on the global configuration generation interval on the Central Server and the global configuration download interval on the Security Server. Other Security Servers become aware of a change in a subsystem’s state only after they download an updated version of the global configuration, so they may still try to contact services under a disabled subsystem during the delay. The Security Server doesn't forward such requests to the service provider information systems. Instead, all received requests fail and an error message is returned to the service consumer.
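A back-of-the-envelope estimate of the delay can be written as below. It assumes the worst case is a change that just misses one generation round on the Central Server and then just misses one download round on the Security Server, and it ignores processing and network time. The function name and the 60-second values in the usage note are illustrative assumptions, not documented defaults.

```python
def worst_case_propagation_seconds(
    generation_interval_s: int,
    download_interval_s: int,
) -> int:
    """Upper-bound estimate for a state change to reach another Security Server."""
    # Wait up to one full generation interval for the change to be included in
    # the global configuration, then up to one full download interval for the
    # other Security Server to fetch it.
    return generation_interval_s + download_interval_s
```

With both intervals at 60 seconds, the estimate is 120 seconds, which is in line with the couple of minutes mentioned above.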

Changing the Security Server address

The Security Server address is added to the Central Server when the authentication certificate is registered. Before version 7.4.0, updating the address afterwards required the Security Server administrator to contact the Central Server administrator over some external channel, e.g., email. Starting from version 7.4.0, the address can be updated using the Security Server administrator UI or management REST API.

The Security Server administrator can update the Security Server address by using the Security Server administrator UI and management REST API. The change is approved automatically by the Central Server and doesn't require manual approval from the Central Server administrator. By default, the change becomes visible to all the Security Servers in the ecosystem within a couple of minutes. However, the length of the delay depends on the global configuration generation interval on the Central Server and the global configuration download interval on the Security Server.

Migrating from Akka to gRPC

In September 2022, Lightbend announced that Akka, one of the third-party open-source libraries that X-Road heavily depends on, would change its licensing model. The new licensing model is commercial, but open-source projects meeting specific criteria may apply for an exception. However, the conditions of the license granted to open-source projects differ from the Apache 2.0 open-source license that Akka used before the change. From X-Road's perspective, the new license includes some unfavourable conditions, so Akka has been replaced with gRPC in X-Road.

Because of the change, any Akka-related system properties are not supported anymore starting from version 7.4.0, e.g., “<component>.akka.remote.artery.advanced.maximum-frame-size”. If Akka-related system properties have been used to override Akka’s default configuration values, they should be manually removed from the configuration.

Technical changes and upgrades

Besides functional changes and new features, several more technical changes have also taken place under the hood. The UI framework has been upgraded from Vue 2 to Vue 3, and the Spring Boot backend framework has been upgraded from version 2 to 3.

Also, the supported Java version has been upgraded from Java 11 to Java 17. The new Java version is installed automatically on all supported platforms except Red Hat Enterprise Linux 7 (RHEL7). On RHEL7, the Security Server administrator must manually configure a new package repository. More information about the upgrade process will be available in the release notes.

Other improvements and updates

Besides the already-mentioned features, version 7.4.0 includes many other minor improvements and updates. For example, the release adds a new command line tool for verifying message log archive hash chains, removes deprecated rate limiting parameters on the Security Server, improves the Central Server UI, supports publishing REST services using OpenAPI 3.1, and replaces the SHA-1 hashing algorithm with SHA-256. To fully understand all the changes, please review the release notes document.

What’s next?

Initially, support for automated certificate management through the Automatic Certificate Management Environment (ACME) protocol was scheduled for version 7.4.0. Unfortunately, support for ACME didn't make it to version 7.4.0, but it will be included in version 7.5.0 instead. Other changes that will be introduced in version 7.5.0 include, but are not limited to, support for Ubuntu 24.04 LTS and Red Hat Enterprise Linux (RHEL) 9. Version 7.5.0 will be released during Q2 2024.

While developing X-Road 7, NIIS is already working on the next major version, X-Road 8. If you’re interested in hearing more about the topic, please check out the talk that I delivered at the X-Road Community Event 2023 in September. More news regarding X-Road 8 will be published soon. Stay tuned!

Petteri Kivimäki

28 June 2023

Is X-Road a Data Space Technology?

Data spaces are a hot topic right now in the field of data exchange and interoperability. However, it's not a new concept since its roots date back to 2005. The original definition was somewhat technical, but over the years, the term has expanded to cover all four layers of interoperability defined by the European Interoperability Framework (EIF).

Today, the term data space has several different definitions depending on whom you ask. Multiple international initiatives are working on data spaces to create common governance models, specifications, reference implementations, etc. Let's look at the definitions created by some of those initiatives.

”An infrastructure that enables data transactions between different data ecosystem parties based on the governance framework of that data space. Data space should be generic enough to support the implementation of multiple use cases.”

Source: Data Spaces Support Centre - DSSC Glossary - Version 1.0

”Decentralised, governed and standard-based structure to enable trustworthy data sharing between the data space participants on a voluntary basis. Data spaces may be purpose- or sector-specific, or cross-sectoral. Common European data spaces are a subset of data spaces within the scope of EU policies.”

Source: Gaia-X Glossary

”A Dataspace is a set of technical services that facilitate interoperable asset sharing between entities.”

Source: IDS Dataspace Protocol v0.8

”A data space is a federated data ecosystem within a certain application domain and based on shared policies and rules.”

Source: Sitra Rulebook for a Fair Data Economy

”A common European data space brings together relevant data infrastructures and governance frameworks in order to facilitate data pooling and sharing.”

Source: Commission Staff Working Document on Common European Data Spaces

When looking at the definitions, it’s clear that they are defining the same subject even though they’re approaching it from different angles. Also, it’s evident that data spaces are more than a technology since governance and policies play a significant role in the definitions.

The first version of X-Road was released in Estonia in 2001 – four years before the term data space appeared in an article for the first time. The technical implementation of X-Road has evolved and changed over the years, but the main concept has remained the same since the beginning. Also, X-Road is not only about technology; it also has an organisational model and trust framework. When comparing X-Road to the data space definitions, it’s easy to find many common factors and similarities. In fact, the definitions could very well be used to describe X-Road on a high level. But does that make X-Road a data space technology?

To answer the question, digging deeper into data spaces is necessary. Let's look at more detailed documentation and specifications to see how well X-Road is aligned with them.

The International Data Spaces Association

The International Data Spaces Association (IDSA) is a not-for-profit organisation aiming to create a global standard for international data spaces (IDS) and foster the global data spaces community. IDSA provides several data spaces related assets, for example:

IDS Rulebook
IDS Reference Architecture Model (IDS-RAM)
IDS G
IDS Dataspace Protocol.

Let’s study some of the assets provided by the IDSA and see how well X-Road is aligned with them. However, this will not be a complete comparison between the IDSA assets and X-Road. Instead, only selected parts of specific assets will be reviewed to highlight similarities and differences. Studying the source materials is recommended if you're interested in the topic in more detail.

Data space fundamentals

The IDS Rulebook defines foundational concepts of data space that seem to apply to X-Road on a high level:

Establishing trust
Data discoverability
Data contract negotiation
Data sharing & usage
Observability
Vocabularies and semantic models.

The IDS Rulebook and X-Road have the same vision of enabling data sovereignty and creating a trusted data-sharing ecosystem. They’re both based on the idea that trust is rooted in one or more trust anchors, trusted participants must meet specific policies that can vary between different ecosystems, and data sharing consists of peer-to-peer interactions between the participants.

Policies play an essential role in any data ecosystem. The IDS Rulebook defines multiple policy groups and the relationships between them. The same policy groups apply to X-Road, too, with the difference that X-Road doesn't define the structure and relationship of the policies or offer tools to express them. For example, the IDS specifications cover defining usage and contract policies and negotiating them as a part of the data exchange process. In X-Road, it’s up to the data exchange parties to agree on those policies using some external channel, e.g., email or service catalogue. X-Road also supports managing access to services, but the access policies are not technically associated with any other policies since the other policies are created and maintained outside of X-Road.

Operating models

The IDS Rulebook defines three different operating models for data spaces:

Centralised data space authority
Federated / distributed data space authority
Decentralised data space authority.

In the context of X-Road, the data space authority is equal to the X-Road Governing Authority. X-Road supports centralised and federated authority. Centralised authority is the most typical alternative for X-Road, and it's the model that a single X-Road ecosystem uses. When two X-Road ecosystems are federated, the responsibility can be considered distributed between the authorities of the federated ecosystems. However, even in a federated setup, each authority only has control over its own ecosystem.

Conceptual model

The conceptual model defined by the IDS Dataspace Protocol is very similar to X-Road’s conceptual model. Doing a mapping between the models is straightforward:

Dataspace – X-Road Ecosystem
Dataspace Authority – X-Road Governing Authority
Participant – Member
Participant Agent – Security Server
Identity Provider – Certificate Authority
Credential Issuer – N/A
Dataspace Registry – Central Server, Service Catalogue

The descriptions of the organisational entities (Dataspace Authority, Participant, Identity Provider) are pretty well aligned. However, there are differences in the descriptions of the technical entities (Participant Agent, Dataspace Registry). The technical differences are covered in the Components section.

Components

Comparing the components provides a good overview of the similarities and differences between the IDS-RAM and X-Road from a technical perspective. The IDS Identity Provider is not actively involved in the data exchange process, so it's omitted from the illustration.

IDS Identity Provider

The IDS Identity Provider consists of three complementary components:

Certificate Authorities (CAs) issue and manage technical identity claims.
The Dynamic Attribute Provisioning Service (DAPS) issues short-lived tokens providing information about Connectors.
The Participant Information Service (ParIS) provides business-related details on Participants.

The IDS Identity Provider combines X-Road's Certificate Authority and Central Server. In X-Road, the roles of the CA and DAPS are combined since Member and Security Server-related information is included in the certificates. At the same time, the IDS Identity Provider decouples the certificate and the associated information. In that way, it's possible to alter the attributes of a certificate, e.g., attach new attributes to a certificate, without reissuing the certificate.

In X-Road, the Central Server stores some business-related information about Members. In addition, the Service Catalogue is a complementary X-Road component that may provide additional information about Members. Therefore, the Central Server and Service Catalogue are equal to the ParIS.

IDS Metadata Broker

The IDS Metadata Broker is basically a registry of IDS Connectors, their capabilities, characteristics, and metadata of the data they offer. The IDS Metadata Broker provides endpoints for registering, publishing, maintaining, and querying the metadata. The X-Road Central Server has a similar role as a registry of Members and Security Servers, with the difference that the Central Server doesn't store any information about the data or services offered by Security Servers. Instead, the Service Catalogue is an optional X-Road component that provides information about the data and services.

IDS Connector

The IDS Connector is the point of access to a data space, providing standardised data exchange between Participants. The IDS Connector is connected to all the other data space components since it provides data and metadata to them. For example, the IDS Connector provides technical interface description, authentication mechanism, and associated data usage policies to the Metadata Broker and usage contracts to the Clearing House. Also, several other IDS components (App Store, Metadata Broker, and Clearing House) are based on the IDS Connector architecture.

In X-Road, the component that provides access to an X-Road ecosystem is the Security Server. Like the IDS Connector, the Security Server provides secure and standardised data exchange between Members. The Security Server communicates with all the other components of an X-Road instance – either directly or indirectly.

IDS Clearing House

The IDS Clearing House provides a logging service that records relevant information for clearing, billing and usage control. For example, the information is used to provide a settlement service based on usage contracts which can help to automate payments between the data exchange parties. In addition, the logged usage control data can be used to validate access to resources. Technically, the IDS Clearing House consists of an IDS Connector.

X-Road doesn’t have a component that would be a direct match with the IDS Clearing House. However, the optional X-Road Metrics component provides some similar capabilities, e.g., collecting service usage statistics that can be used to automate payments. In addition, the Security Server provides a digitally signed and timestamped log record of each data exchange transaction that can be used to validate the transaction afterwards. However, the data exchange parties store the log records locally, and no third parties have access to them.

IDS App Store

The IDS App Store is a platform to distribute IDS Apps for IDS Connectors. Currently, X-Road doesn’t have a similar component.

IDS Vocabulary Hub

The IDS Vocabulary Hub is a vocabulary management platform for IDS use cases. It’s a platform to host, maintain, publish, and document vocabularies. It gives access to the vocabulary terms and their descriptions. The IDS Information Model is the lowest common denominator and can be extended with additional vocabularies. The IDS Vocabulary Hub communicates with IDS Connectors and infrastructure components.

The X-Road architecture doesn’t include a vocabulary component, but it goes without saying that interoperable data exchange requires semantic interoperability. Therefore, it’s up to the X-Road Governing Authority and data exchange parties to agree on where and how vocabularies are managed and stored. For example, the Finnish Digital Agency has developed an Interoperability Platform that can be used to maintain and publish common terminologies.

Conclusions

Comparing data spaces and X-Road from various aspects shows that they have a lot in common on a high level. However, looking into the details shows that they also have many differences.

For example, the IDS specifications cover various aspects of data exchange, e.g., usage conditions and policies, contract negotiation, and data transfer process. They do not set restrictions to the data transfer protocol or wire protocol – the protocol used to transfer the data. In other words, any existing transfer protocol can be used, which enables using existing solutions and supporting many different transfer protocols. This is a big difference compared to X-Road, which has its own transfer protocol. Also, X-Road doesn’t technically cover the before-mentioned aspects (usage conditions and policies, contract negotiation) covered by the IDS specifications.

Let's return to the original question: can X-Road be considered a data space technology? The answer is no if X-Road is strictly compared to the IDS specifications, despite the similarities. The IDSA has a certification scheme for the IDS components, and X-Road would not meet the certification requirements. The IDS Testbed is a platform that can be used to conduct evaluations that ensure that an IDS component is implemented securely and conforms to the relevant specifications. Components that pass the evaluation receive certification and are approved in the IDS ecosystems.

On the other hand, the European Commission considers data exchange ecosystems to be data spaces even if they are not based on technologies conforming to the IDS specifications, for example, the Once-Only Technical System (OOTS) and European Health Data Space (EHDS). Those ecosystems align with the common definition and characteristics for data spaces, but the technologies they use in the implementation do not conform to the IDS specifications. With the same logic, an X-Road ecosystem could also be considered a data space.

Multiple organisations and initiatives are developing specifications and standards for data spaces, e.g., International Data Spaces Association (IDSA), Gaia-X, and Fiware. The initiatives work together, aiming to keep the specifications and standards aligned and not create multiple overlapping or competing specifications. However, the work is currently in progress, and not everything is fully ready and aligned yet. Therefore, it might be too early to define what is a data space and what’s not based on the deliverables of a single actor. Nevertheless, in the long run, common standards and specifications are the only way to achieve interoperability within and between data spaces.

Petteri Kivimäki

20 June 2023

New X-Road® Central Server UI and management REST API are here!

In recent years, the Security Server has experienced a total external and internal makeover. The process got started in 2019 when support for REST services was added. In 2020, the Security Server got a new user interface (UI) and a management REST API that enabled the automation of common configuration and maintenance tasks. Releasing X-Road 7 in 2021 enhanced the UI's look and feel and brought several other significant changes and improvements under the hood. While all these major changes have been implemented for the Security Server, the Central Server has received only some smaller updates. However, the Central Server has been remembered and will be the star of X-Road 7.3.0.

The beta version of X-Road 7.3.0 is already out, and the official release version will be published at the end of June 2023.

Easier administration and streamlined onboarding process

The most significant change in X-Road version 7.3.0 is the fully renewed Central Server UI. The new UI improves the usability and user experience of the Central Server. The new intuitive UI makes regular administrative tasks easier and supports streamlining the onboarding process of new X-Road members.

For example, complementary management requests for authentication certificates and client registration requests are no longer required. It's enough to send a registration request from the Security Server and approve it with two clicks on the Central Server. And like before, enabling automatic approval of registration requests makes the approval process fully automated.

*Image 1. List pending management requests.*

Management REST API enables task automation

Another significant change in X-Road version 7.3.0 is the brand-new Central Server management REST API. The API provides all the same functionalities as the UI and can be used to automate common maintenance and management tasks. Maintaining and operating the Central Server can be done more efficiently as configuration and maintenance tasks require less manual work. The new UI uses the same API under the hood.

The Central Server User Guide provides more information about the API, and the API's OpenAPI 3 description is available on GitHub. Access to the API is controlled using API keys that can be managed through the Central Server UI or through the API itself. In addition, access to the API can be restricted using IP filtering.
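As a hedged illustration, an API-key-authenticated request could be prepared like this. The header scheme, port and endpoint path used here are assumptions for the example and should be checked against the Central Server User Guide and the OpenAPI description. The helper only builds the request and doesn't send it.

```python
import urllib.request


def build_api_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying an API key."""
    request = urllib.request.Request(f"{base_url.rstrip('/')}{path}", method="GET")
    # Assumed header scheme; verify the exact format in the User Guide.
    request.add_header("Authorization", f"X-Road-ApiKey token={api_key}")
    request.add_header("Accept", "application/json")
    return request


# Hypothetical host, port, path and key, used only to show the call shape.
example = build_api_request("https://cs.example.org:4000", "/api/v1/members", "my-api-key")
```

The prepared request can then be passed to `urllib.request.urlopen` together with an SSL context that trusts the Central Server's TLS certificate.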

Changes in the architecture

The new UI and management REST API have also caused changes in the Central Server architecture and packaging. The previously existing Jetty (xroad-jetty) component has been replaced with the new UI and API (xroad-center), registration service (xroad-center-registration-service) and management service (xroad-center-management-service) components. These changes have affected Central Server’s log files, directories, software packages, and services. It’s strongly recommended that Central Server administrators study the details of these changes from the release notes before upgrading to version 7.3.0.

*Image 4. Changes in the Central Server architecture - before version 7.3.0 (left) and starting from version 7.3.0 (right).*

Wait, there’s more!

Even though the new Central Server UI and management REST API are the most significant and most visible changes in version 7.3.0, the new version contains many other new features, improvements, and fixes. Here’s a short overview of other changes included in the latest version.

Security improvements on the Central Server:
- Encrypt backup files (opt-in)
- Verify the integrity of backup files on restore.
Run all the X-Road components on Java 11. Remove support for Java 8.
Create a separate security hardening guide that provides information about hardening the Central Server and Security Server host configurations.
Implement configurable request rate and size limits for the Central Server REST API and management services.
Changes in allowed characters in X-Road system identifiers and improved validation of the identifiers.
Technology updates and a decrease in technical debt.

The complete list of changes with more detailed descriptions is available in the release notes.

What’s next?

Implementing the new Central Server was a long process that required more time and effort than initially expected. Unfortunately, this has postponed the implementation of some other new features. More changes to the Central Server are scheduled for upcoming X-Road versions, but the focus will now shift to other roadmap items.

More information about the X-Road development roadmap is available here. More detailed information about the backlog items scheduled for version 7.4.0 is available here.

Third-party security experts have assessed the security of the new Central Server. However, should you have any findings, they can be reported through the newly launched X-Road Bug Bounty program.

Andrius Matšenas

20 March 2023

Unravelling the Complexities of National Data Exchange Networks: A Network Science Approach

Introduction

This post is based on the findings from my research project titled "Graph Analysis of Dynamic National Data Exchange Networks."

In the age of relentless digital connectivity, understanding complex networks has become increasingly critical, spanning from social media platforms to the emerging world of blockchain technologies. X-Road, an established data exchange infrastructure, has been embraced by countries such as Estonia, Finland, Iceland, Colombia, Argentina, and Vietnam. Catering to millions of individuals, X-Road can be viewed as a complex network where government bodies, companies, non-profits, and various other organisations exchange data with one another.

In this post, we shall explore the intricacies of national data exchange networks through the lens of network science. By investigating the Estonian X-Road network (X-tee), I aimed to better understand the underlying patterns within data exchange networks and pinpoint potential areas for enhancement. Estonia has been collecting the network's transaction data (through the X-Road Metrics component, an open-source extension to X-Road) since 2016. The anonymised open data serves as a valuable starting point, and the insights derived could potentially be applicable to other nations as well.

Key Findings: A Network Science Approach

By analysing over 30 million data queries on the Estonian X-Road network, several key insights were obtained using network science analysis methods. The network shares common attributes with other real-world networks:

Sparsity: an overall low number of connections compared to the maximum possible connections among its members.
Central giant component: a dominant connected subgraph in which a large fraction of the network's nodes/members are interconnected.
Power law distribution for parts of the network: revealing a small number of highly connected nodes and a large number of less connected ones.

These identified characteristics suggest that the network is well-suited for further modelling and analysis using network science methodologies.
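The three attributes listed above can be measured with a few lines of code. The following is a minimal sketch on a made-up toy network, not the analysis pipeline used in the research project; the member names and edges are hypothetical, and links are treated as undirected for simplicity.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one (client, provider) pair per observed relation.
edges = [
    ("tax", "population"), ("tax", "business"), ("hospital", "health_fund"),
    ("clinic", "health_fund"), ("health_fund", "population"),
    ("school", "education"),
]


def build_adjacency(edge_list):
    """Build undirected adjacency sets from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edge_list:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def density(adj):
    """Share of all possible undirected links that actually exist (sparsity)."""
    n = len(adj)
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def largest_component_share(adj):
    """Fraction of nodes that belong to the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        best = max(best, size)
    return best / len(adj)


def degree_histogram(adj):
    """Map each degree to the number of nodes with that degree."""
    return Counter(len(neighbours) for neighbours in adj.values())


adj = build_adjacency(edges)
print(density(adj), largest_component_share(adj), degree_histogram(adj))
```

On real data, a low density, a largest-component share close to one, and a heavy-tailed degree histogram would correspond to the sparsity, giant component, and power-law observations respectively.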

Some of the key findings from the analysis:

Public sector organisations, particularly governmental institutions, form the backbone of the data exchange infrastructure, being the most connected and active members of the network.
Nighttime is the prime time for mass data queries from government organisations on people and companies, particularly for tax authorities and bankruptcy bailiffs. During the daytime, service sectors like healthcare flourish, with the Health Insurance Fund and hospitals among the most active X-tee members.
The network's most active members could be grouped into five distinct communities:

Healthcare
IT and Infrastructure
Social Security and Taxes
Internal Affairs and Transport
Education, Defence, and Environment.

*Figure 1. The 50 most active members clustered into 5 communities.* *See the full-size image*.

Though the community groupings may not be flawless, it's crucial to emphasise that these communities were identified solely by analysing query volumes between network members throughout the day. The content of the data queries, which is not publicly available, was not factored into the community detection process. This implies meaningful relationships between network members and groups of members could be discovered even without contextual information.

Implications and Future Directions

The findings from this analysis project have several implications.

First, the research demonstrates the value of network science in modelling and analysing data exchange networks. This paves the way for more advanced prediction models and real-time monitoring tools. By discerning interaction patterns and activity distribution, decision-makers can enhance system performance, addressing both cybersecurity and economic concerns.

Second, the research highlights the importance of the public sector in driving data exchange, as well as the diverse range of services that rely on these networks. Understanding these interactions could help policymakers optimise resource allocation and improve the overall functioning of public services.

Lastly, the ability to accurately identify communities within the network suggests that further insights can be gained by examining the data transaction flows between these groups. This could potentially lead to a better understanding of the relationships between different sectors and the dynamics of the data economy.

Limitations and Challenges

While the findings of this research project provide valuable insights into the intricacies of national data exchange networks, it is essential to acknowledge some limitations that could impact the conclusions drawn from the analysis.

Lack of contextual information: The reliance on transaction data, without the content of the data queries, limits the depth of understanding of the relationships between network members. Including contextual information could provide a more comprehensive view of how different sectors interact within the network.
Generalizability: The analysis is based on the Estonian X-Road network, and the findings may not be directly applicable to other countries or networks with distinct characteristics or data exchange practices.
Possible biases: The data or methodology used in the analysis may introduce biases that could affect the outcomes and conclusions. Further investigation may be required to identify and address these biases to ensure the reliability and validity of the findings.
Dynamic nature of data exchange networks: As networks evolve over time, the findings from this research may be impacted by changes in the network structure or the interactions between members. Periodic re-analysis or real-time monitoring would be needed to maintain an accurate understanding of the network dynamics.
Need for further research: The findings presented in this blog post warrant additional investigation to validate or expand on the conclusions. Future research could explore the impact of incorporating contextual information, compare data exchange networks across countries, or investigate the relationships between different sectors and the dynamics of the data economy more thoroughly.

By acknowledging and addressing these limitations, the research can be further refined, and the understanding of national data exchange networks can be deepened, ultimately contributing to more effective decision-making and policy development.

Conclusion

In conclusion, this research project demonstrated the power of network science in shedding light on the complex world of national data exchange networks. As an increasing number of countries adopt data exchange solutions like X-Road, understanding their intricacies will be crucial in improving decision-making, reducing bureaucracy, and enhancing the overall happiness of citizens. The methodologies and insights derived from this project could serve as a valuable foundation for future work in this domain and may also encourage more countries and municipalities to adopt secure data exchange layers, ultimately benefiting millions of people around the world.

Andrius Matšenas, a recent Mathematics graduate from the University of Southampton, has a strong interest in network science, which he delved into in his BSc thesis – the basis for this blog post. With a passion for designing software products, Andrius co-founded Stardust Network, where he led a team to develop apps that empower users to take control of their personal data. He also gained valuable product development experience as a Product Analyst at NFTPort. Find out more: matsenas.ee

Petteri Kivimäki

14 February 2023

Database, dataset, data service, or service? Getting to know X-Road services.

Services are essential building materials of any data exchange ecosystem, and X-Road is no exception. Therefore, the number of available services is one of the key metrics when measuring the effects and benefits of an X-Road ecosystem.

For clarity, in X-Road, services are technical interfaces between information systems that are not used by end-users directly. Instead, end-users communicate with X-Road services indirectly through other systems and platforms, e.g., using the state portal that fetches information from multiple base registries over X-Road.

Besides the number of services, other important metrics are the number of member organisations and connected information systems, and the amount of data and queries exchanged between the members. However, from the ecosystem perspective, the number of available services may be the most essential metric. According to a study by Kristjan Vassil in 2016, the discrete threshold for ecosystem growth appears at 50 data repositories.

X-Road comes with tools to measure the number of available services in an ecosystem. Since X-Road is based on a decentralized architecture, the X-Road member organizations maintain information about available services locally on their Security Servers. There's no centralized list of available services by default, but the X-Road Operator may collect the data from Security Servers and publish it in a service catalog. For this purpose, the Security Server provides a set of built-in metadata services that can query metadata about the services published by different member organisations. However, interpreting the numbers requires some background knowledge of what the term service means in the context of X-Road. This blog post aims to explain the anatomy of service in X-Road.

Database, dataset, data service, or service?

Over the years, many different terms have been used to describe information systems in service providers' roles in X-Road. The term database was the most used for many years, while the term service has become the most common in recent years. Nevertheless, different terms are still used interchangeably.

Technically, X-Road doesn't distinguish between a database, data set, data service, or service. Also, there's no difference between a data service, a business service, or an aggregated service. Instead, X-Road divides services into SOAP, OpenAPI, and REST. In other words, dividing services into different categories is based on their technical characteristics rather than the type of the service.

Identifying services

In X-Road, all services are identified using a unique service identifier string. The identifier is used to invoke services, and the number of available services in an X-Road ecosystem is determined by counting the service identifiers. The service identifier includes information about the X-Road ecosystem, the organization owning the service, and the information system providing the service. The service identifier doesn't contain any information about the service category, but the built-in metadata services can be used to access the category information.
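To make the structure of an identifier more concrete, here is a minimal sketch that models it as a small data class. The field names follow my understanding of the X-Road identifier scheme (instance, member class, member code, subsystem code, service code, optional version) and should be treated as assumptions; the slash-separated rendering and all example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceId:
    instance: str        # the X-Road ecosystem (instance)
    member_class: str    # class of the organization owning the service
    member_code: str     # code of the organization owning the service
    subsystem_code: str  # information system providing the service
    service_code: str    # the service itself
    service_version: Optional[str] = None  # optional service version

    def as_path(self) -> str:
        """Render the identifier as a slash-separated string."""
        parts = [self.instance, self.member_class, self.member_code,
                 self.subsystem_code, self.service_code]
        if self.service_version:
            parts.append(self.service_version)
        return "/".join(parts)


def count_services(service_ids):
    """Count available services as the number of distinct identifiers."""
    return len(set(service_ids))


example = ServiceId("EX", "GOV", "12345", "registry", "getPerson", "v1")
print(example.as_path())
```

Because the data class is frozen and therefore hashable, identifiers collected from several Security Servers can be put into a set, which mirrors the point that publishing a service on multiple servers does not change how it is counted.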

Connecting services

Publishing a service to X-Road requires that the organization owning the service has been registered as a member of an X-Road ecosystem and has access to a Security Server. First, the organization must complete an onboarding process to join an X-Road ecosystem. The organisation’s identity is verified by a trusted Certificate Authority (CA), and the X-Road Operator registers the organization. As a result, the organization is given an X-Road organization identifier.

Security Server is the organisation’s technical access point to the X-Road ecosystem. The organization may deploy its own Security Server, use a shared Security Server, or buy a Security Server as a service from a commercial service provider.

Once an X-Road member organization has access to a Security Server, the information system providing a service must be connected to the X-Road ecosystem. In X-Road's terms, this means registering a new subsystem identifier on the Security Server(s) or utilizing an existing subsystem. A subsystem represents an information system or a logical group of information systems. The information system may be in a service consumer role, a service provider role, or both.

The service is then added under the subsystem. If the information system provides multiple services, all the services can be added under the same subsystem. Alternatively, services published by the same information system can be added under different subsystems. Access permissions to the services are defined on a service level, so whether the same or a different subsystem is used doesn't affect them.

A service can be published on one or more Security Servers simultaneously, and one Security Server can publish multiple services owned by different organisations. For high availability, publishing services on multiple Security Servers is recommended. The Security Server supports high availability in two different ways: internal and external load balancing. The number of Security Servers where a service is published doesn’t affect the service identifier – it’s always the same.

On the other hand, it's also possible to publish the same service under two different service identifiers. For example, a free version of a service could be published on a standalone Security Server with no high availability, and a paid version of the same service could be published with a different service identifier on a Security Server cluster with external load balancing. That way, it's possible to provide two versions of the same service with different SLAs.

SOAP, OpenAPI, and REST services

X-Road supports three service categories: SOAP, OpenAPI, and REST. Regardless of the category, all the services are identified using a service identifier. However, how the services work and are managed varies between the categories.

SOAP

X-Road Message Protocol for SOAP defines how service consumers and service producers communicate with the Security Server. The protocol is based on SOAP 1.1. It comes with some X-Road-specific limitations and additional requirements, e.g., only synchronous request-response operations are supported, certain SOAP headers are mandatory, and a document/literal style SOAP body is required.

A common approach is to have an additional adapter service component between the Security Server and a SOAP client or service. The adapter service implements the X-Road Message Protocol for SOAP and converts all incoming/outgoing messages to/from the X-Road SOAP profile.

SOAP services are connected to the Security Server by providing a URL of a WSDL service description that may contain one or more SOAP service endpoints. The Security Server validates the structure of the WSDL description, but it doesn’t validate the WSDL against the service endpoint implementation.

Typically, a single SOAP endpoint represents a service that implements one action or procedure. Each endpoint has a unique service identifier. Also, endpoints can have multiple versions that have their own identifiers. For example, a WSDL description with four SOAP endpoints counts as four X-Road services since each endpoint gets its own X-Road service identifier.

Access permissions to SOAP services are managed on the service endpoint level. When a single WSDL contains multiple SOAP endpoints, access rights must be defined for each endpoint separately.

OpenAPI

Consuming and producing OpenAPI and REST services via X-Road is possible without an additional adapter service component. X-Road-specific information required by the Security Server (e.g., service client identifier, service provider identifier, message id, etc.) is transferred and processed so that existing REST-style services and service consumers can be connected to X-Road with minimal changes or no changes at all. This is achieved by transferring X-Road-specific information required by the Security Server in HTTP headers and URL parameters outside the message payload. The full details are available in the X-Road Message Protocol for REST document.
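As a rough illustration of keeping X-Road-specific information outside the message payload, the sketch below assembles the URL and headers for a REST call through a Security Server. The `/r1/` path prefix and the `X-Road-Client` header name reflect my reading of the X-Road Message Protocol for REST and should be verified against that document; the host name, identifiers, and the `build_rest_call` helper are hypothetical.

```python
def build_rest_call(security_server, client_id, service_id, resource_path):
    """Return (url, headers) for a REST request sent via a Security Server.

    The service identifier travels in the URL and the client identifier in
    an HTTP header, so the request body itself needs no X-Road envelope.
    """
    base = security_server.rstrip("/")
    path = resource_path.lstrip("/")
    url = f"{base}/r1/{service_id}/{path}"
    headers = {"X-Road-Client": client_id}  # assumed header name
    return url, headers


url, headers = build_rest_call(
    "https://ss.example.org/",          # hypothetical Security Server address
    "EX/COM/98765/portal",              # hypothetical client subsystem
    "EX/GOV/12345/registry/persons",    # hypothetical service identifier
    "/v1/persons/42",                   # path of the actual API endpoint
)
print(url)
print(headers)
```

The existing API path (`/v1/persons/42` here) is simply appended after the service identifier, which is why an existing REST client typically only needs a different base URL and one extra header.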

OpenAPI services are REST APIs that have an OpenAPI Specification available. The Security Server doesn't set any restrictions to the content type of the API messages, so the content type isn't limited to JSON only.

OpenAPI services are connected to the Security Server by providing a URL of an OpenAPI specification that describes a REST API with one or more API endpoints. The Security Server validates the structure of the OpenAPI specification, but it doesn’t validate the specification against the API implementation.

Typically, REST APIs are resource-centric, and endpoints are used to change the state of resources. However, REST APIs may also be RPC-style and implement actions or procedures. Either way, an OpenAPI service has a unique service identifier that covers all its API endpoints. For example, an OpenAPI service with four API endpoints counts as one X-Road service.

A single OpenAPI service may support multiple API versions, or different API versions may be published as separate OpenAPI services. If the API version is included in the API base path URL (e.g., https://api.example.com/v1), a new OpenAPI service must be created for each new API version. If the API version isn't included in the base path URL (e.g., https://api.example.com), the same OpenAPI service can provide access to different API versions. All the paths under the base path URL are accessible to service clients with sufficient access permissions, so the base path URL must be defined with caution.

Access permissions to OpenAPI services can be managed on the API and API endpoint levels. Giving access on the API level means providing access to all the API endpoints by default. If the API has endpoints not defined in the OpenAPI specification, they can be accessed too. In contrast, giving access on the API endpoint level only provides access to specific endpoints. API endpoint level access permissions are defined using a combination of HTTP request method and path. Therefore, it is possible to define access rights for a single endpoint or for a subset of endpoints using wildcards. By default, the Security Server has a list of all the endpoints defined in the OpenAPI specification, but adding new endpoints manually is supported. The Security Server’s access rights management only supports allowing access; explicitly denying access is not supported. For example, it is not possible to allow access to all endpoints on the API level and then deny access to a single endpoint.
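The allow-only model described above can be sketched as a list of grants, where a request passes if at least one grant matches. This is an illustrative model only: the grant tuples, the client names, and the use of `fnmatch`-style wildcards are assumptions and do not reproduce the Security Server's actual matching rules.

```python
from fnmatch import fnmatchcase


def is_allowed(grants, client, method, path):
    """Return True if any grant matches; there is no way to express a deny."""
    for grant_client, grant_method, grant_pattern in grants:
        if grant_client != client:
            continue
        if grant_method not in ("*", method):
            continue
        if fnmatchcase(path, grant_pattern):
            return True
    return False


# Hypothetical grants: (client, HTTP method, path pattern).
grants = [
    ("portal", "*", "*"),         # API-level access: every method and path
    ("stats", "GET", "/pets/*"),  # endpoint-level access using a wildcard
]

print(is_allowed(grants, "portal", "DELETE", "/anything"))  # API-level grant
print(is_allowed(grants, "stats", "GET", "/pets/7"))        # endpoint grant
print(is_allowed(grants, "stats", "POST", "/pets/7"))       # method not granted
```

Since grants can only add permissions, restricting the `portal` client to everything except one endpoint would require replacing its API-level grant with a set of endpoint-level grants.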

Besides access rights management, the Security Server does not use the endpoint-related information for anything else, e.g., the Security Server does not validate if an endpoint defined in a request by a client information system exists under an API or not. In other words, if a client information system has sufficient access rights to invoke an API endpoint, the Security Server forwards the request to the specified endpoint without any further validations.

REST

REST services are REST APIs that don’t have an OpenAPI specification available. REST services are connected to the Security Server by providing the API's base path URL. A REST service has a unique service identifier that covers all its API endpoints. For example, a REST API with four API endpoints counts as one X-Road service.

REST services can be used to connect any HTTP-based endpoints to the Security Server. The Security Server doesn't set any restrictions on the content type of the API messages, so the content type isn't limited to JSON only. For example, a group of SOAP services could be connected to the Security Server using a REST service. It would be enough to provide the base URL of the SOAP services without a WSDL service description. In that case, the group of SOAP services would count as one X-Road service.

Access permissions to REST services can be managed on the API and API endpoint levels. Giving access on the API level means providing access to all the API endpoints. In contrast, giving access on the API endpoint level only allows access to specific endpoints. Since REST services don't have an OpenAPI specification that defines the API endpoints, the endpoints must be added manually by the Security Server administrator if they need to be used in access rights management.

Metadata services

The Security Server provides a set of built-in methods that service consumers can use to discover what services are available to them and to download the machine-readable service descriptions. These methods are known as metadata services and are accessed using the service metadata protocol for SOAP and the service metadata protocol for REST. The metadata services have separate versions for SOAP and REST services.

Counting members, information systems, and services

The number of registered member organisations can easily be counted based on the member identifiers. Similarly, the number of connected information systems can be estimated from the subsystem identifiers. However, the number of subsystem identifiers doesn't directly tell the number of connected information systems since a single information system may have multiple subsystems, and various information systems can share the same subsystem. Therefore, the number of connected information systems is only indicative.
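As a small illustration, members and subsystems can be counted from a list of subsystem identifiers. The slash-separated format (instance/member class/member code/subsystem code) and the example values are assumptions made for this sketch.

```python
# Hypothetical subsystem identifiers collected from an ecosystem.
subsystem_ids = [
    "EX/GOV/12345/registry",
    "EX/GOV/12345/portal",
    "EX/COM/98765/billing",
]


def member_of(subsystem_id):
    """Drop the subsystem part to get the owning member's identifier."""
    instance, member_class, member_code, _subsystem = subsystem_id.split("/")
    return f"{instance}/{member_class}/{member_code}"


members = {member_of(s) for s in subsystem_ids}
print("members:", len(members))
print("subsystems:", len(set(subsystem_ids)))
```

The subsystem count produced this way is the indicative figure discussed above: it says nothing about whether two subsystems belong to the same information system or one subsystem fronts several.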

With services, things get more complicated. The number of service identifiers doesn’t directly match the number of connected services. SOAP service endpoints are counted separately, while OpenAPI services are calculated on the API level rather than the API endpoint level. The same applies to REST services as well. However, X-Road supports counting the number of individual API endpoints, too, if they have been defined in an OpenAPI specification or manually.
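The difference between counting service identifiers and counting endpoints can be shown with a short sketch. The dictionary layout and the example service descriptions are hypothetical; only the counting rules come from the text above.

```python
# Hypothetical service descriptions registered on a Security Server.
descriptions = [
    {"type": "SOAP", "endpoints": ["getA", "getB", "getC", "getD"]},
    {"type": "OPENAPI", "endpoints": ["GET /pets", "POST /pets",
                                      "GET /pets/*", "DELETE /pets/*"]},
    {"type": "REST", "endpoints": []},  # no endpoints added manually
]


def count_service_identifiers(items):
    """SOAP endpoints each get an identifier; an OpenAPI or REST API gets one."""
    total = 0
    for item in items:
        if item["type"] == "SOAP":
            total += len(item["endpoints"])
        else:
            total += 1
    return total


def count_known_endpoints(items):
    """Count every endpoint defined in a description or added manually."""
    return sum(len(item["endpoints"]) for item in items)


print(count_service_identifiers(descriptions))
print(count_known_endpoints(descriptions))
```

Here the WSDL and the OpenAPI specification both describe four endpoints, yet they contribute four and one service identifiers respectively, which is why the two metrics should not be compared directly.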

Nevertheless, the service-related numbers don't say anything about the type of services. For example, they may provide simple access to data, execute a business process, provide service orchestration, etc. Therefore, additional analysis that goes beyond the numbers is needed when comparing the available services of two X-Road ecosystems or evaluating the service coverage of a single ecosystem.

In addition to these metrics, it’s highly recommended that X-Road Operators deploy the X-Road Metrics extension to get more detailed insights into data exchange in the X-Road ecosystem.