25 June 2018

Balancing the Load in X-Road

In general, load balancing means distributing workloads across multiple computing resources. Instead of relying on a single resource, the same service is deployed on multiple resources and service requests are distributed across all of them. If one of the resources stops responding, no more requests are routed to it and the other available resources take care of serving the requests. Load balancing is used to increase performance and availability by using multiple components instead of a single one. Load balancing can be implemented in different ways – a load balancer can be software or hardware based, DNS based or a combination of these alternatives. In addition, load balancing can be implemented on the client or the server side.

X-Road Security Server has an internal client-side load balancer and it also supports external load balancing. The client-side load balancer is a built-in feature and it provides high availability. Support for external load balancing has been available since version 6.16.0 and it provides both high availability and scalability from a performance point of view.

Internal load balancing in X-Road

The internal client-side load balancer is a built-in feature of the X-Road Security Server and it operates on a “Fastest Wins” basis. When a service is registered on multiple Security Servers (same organization, same subsystem, same service code), the server that responds the fastest to a TCP connection establishment request is used by the client Security Server. Once a provider Security Server is selected, it will be used for subsequent requests by the client until the TLS session cache expires or a connection attempt fails.

Image 1. “The Fastest Wins” - the server that responds the fastest to TCP connection establishment request is used by a client Security Server.

If the fastest Security Server providing the service stops answering, the client Security Server will automatically switch to the second fastest and so on. Connections are evaluated on the TCP level only, so higher-level application-related problems are not taken into account.

Image 2. If the fastest service provider fails, the client will automatically change to the second fastest.
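The “Fastest Wins” idea can be illustrated with a small sketch. This is not the Security Server's actual implementation: the function names, the host names and port used in the usage note, and the sequential measurement are assumptions made for illustration only. The sketch times a TCP connection attempt to each candidate, drops the ones that fail, and picks the quickest of the rest.

```python
import socket
import time
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def tcp_connect_time(addr: Address, timeout: float = 2.0) -> Optional[float]:
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection(addr, timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Refused, unreachable or timed out: treat the candidate as unavailable.
        return None


def pick_fastest(
    candidates: Iterable[Address],
    timer: Callable[[Address], Optional[float]] = tcp_connect_time,
) -> Optional[Address]:
    """Pick the candidate with the shortest connection time (hypothetical helper)."""
    timings = [(timer(addr), addr) for addr in candidates]
    reachable = [(t, addr) for t, addr in timings if t is not None]
    if not reachable:
        return None
    # Only TCP-level reachability is considered, as described above.
    return min(reachable, key=lambda pair: pair[0])[1]
```

Calling `pick_fastest([("ss1.example.org", 5500), ("ss2.example.org", 5500)])` would return whichever of the two hypothetical hosts accepts a connection first. If the chosen host later stops answering, running the selection again simply leaves it out and yields the next fastest.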

The solution provides high availability, but not scalability from a performance point of view, as the load is not evenly distributed between all the provider-side Security Servers. However, this does not mean that all the different client Security Servers would use the same provider Security Server. The client Security Server prefers the provider Security Server that is nearest network-wise (the round-trip time is lowest) and the fastest provider varies between different clients. In other words, load generated by different client Security Servers is distributed between different provider Security Servers, but the distribution is not based on a load balancing algorithm so there’s no guarantee that the load is distributed evenly.

Image 3. Load generated by different client Security Servers is distributed between different provider Security Servers.

When relying on internal load balancing, adding a new node means installing and registering a new Security Server. Each Security Server serving the same service has its own identity, which means that it has its own authentication and signing certificates. In addition, each Security Server providing the same service is stand-alone and there’s no automatic synchronization of registered services and/or service-level access permissions between Security Servers. Maintaining and synchronizing configuration between Security Servers is a manual task.

Internal load balancing is completely transparent to the client-side information system as the client Security Server takes care of routing the requests, verification of certificates etc. internally. For the client-side information system it’s enough to send a request to the client-side Security Server and it will take care of the rest using the global configuration data provided by the Central Server for discovering Security Servers that provide the requested service. This makes it possible to add new provider Security Servers and/or change the network location of existing provider Security Servers without making any changes on the client-side. High-security environments, where all the outgoing network connections are blocked by default and only connections to whitelisted targets are allowed, are an exception as all the new provider side Security Servers must be explicitly whitelisted on firewall configuration.

External load balancing in X-Road

First, let’s define the meaning of external load balancing. In this context, external load balancing means that a third-party software or hardware-based load balancer (LB) is used for distributing load (LB 1-2) between an information system and the X-Road Security Server or (LB 3) between Security Servers. There are three different use-cases that include an external load balancer:

A load balancer (LB 1) between service consumer(s) and a Security Server cluster.
A load balancer (LB 2) between service providers and a Security Server cluster or a single Security Server.
An external internet-facing load balancer (LB 3) that distributes inbound requests from other Security Servers to a Security Server cluster.

Image 4. An external load balancer can be used in three different scenarios. Different scenarios can also be combined.

The first two scenarios are about distributing load between the Security Server and an information system and they have always been supported by the Security Server. Therefore, we’re not going to concentrate on them now. Instead, let’s take a better look at the third use-case.

The Security Server has supported the use of an external internet-facing LB since version 6.16.0. In this setup an external LB is used in front of a Security Server cluster and the LB is responsible for routing incoming messages to different nodes of the cluster based on the configured load balancing algorithm. Using the health check API of the Security Server, the LB detects if one of the nodes becomes unresponsive and stops routing messages to it. The solution provides high availability and scalability from a performance point of view.

Image 5. An external LB can be used in front of a Security Server cluster and the LB is responsible for routing incoming messages to different nodes.

A Security Server cluster can have any number of nodes, all of which are active (not hot standby). From the client Security Server’s point of view a cluster looks like a single Security Server as all the nodes have the same identity (server code, certificates etc.) and they’re all accessed using the same public IP that is registered to the global configuration as the Security Server’s address. Therefore, a cluster is completely transparent to client-side information systems and Security Servers.

When a clustered Security Server acts as a client and makes a request to an external server (Security Server, OCSP, TSA, Central Server), the external server sees the public IP address. However, the public IP address used for outgoing requests might be different from the one used for incoming requests.

Image 6. When a clustered Security Server acts as a client and makes a request to an external server, the external server sees the public IP address.

One of the nodes is the master node and all the other nodes are slaves. Maintaining a cluster’s configuration is easy, because configuration changes are done on the master and they’re automatically replicated to the slaves. Replication covers the configuration database and configuration files. Changing the configuration on the slaves is blocked. However, all nodes fetch global configuration and OCSP responses independently. In addition, message log database is Security Server specific and it is not replicated between nodes.

Image 7. Serverconf database and configuration files are automatically replicated from the master to the slaves.

Support for an external LB has not been designed with any specific LB solution in mind. Any software or hardware based LB that supports HTTP health checks and load balancing of TCP traffic can be used, e.g. AWS ELB, F5, HAProxy, Nginx. The LB uses the health check API of the Security Server for checking the state of all the nodes in a cluster. The health check API has been available since version 6.16.0 and it can be used for monitoring purposes as well. The health check returns HTTP 200 OK when the Security Server is operating normally, otherwise HTTP 500. The health check API must be enabled manually as it is disabled by default.
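What an LB does with the health check can be sketched in a few lines. This is only an illustration of the 200/500 contract described above: the node names and URLs in the usage note are hypothetical, and the actual address and port of the health check endpoint depend on how it has been enabled on each node.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List


def fetch_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for error responses such as HTTP 500; keep the code.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout etc.
        return 0


def healthy_nodes(
    nodes: Dict[str, str],
    fetch: Callable[[str], int] = fetch_status,
) -> List[str]:
    """Return the names of the nodes whose health check answered HTTP 200."""
    return [name for name, url in nodes.items() if fetch(url) == 200]
```

For example, `healthy_nodes({"node1": "http://node1.example.org:5588", "node2": "http://node2.example.org:5588"})` would return only the nodes that currently answer with HTTP 200, and an LB would route traffic to those nodes alone. The same function could be reused by a monitoring script.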

The use of an external internet-facing LB also enables dynamic scaling of a Security Server cluster. Dynamic scaling means that the number of nodes in the cluster can be automatically adjusted based on the selected metrics, e.g. CPU load, throughput of incoming requests etc. Scaling can also be done based on a predefined schedule – the number of nodes varies with the time of day and the day of the week, e.g. the number of nodes is increased during peak hours and decreased for the night. As all the nodes in the cluster share the same identity, adding a new node can be automated because it doesn’t have to go through the normal registration process that requires manual work. Changing the number of nodes can be done by creating/deleting nodes or by starting up/shutting down existing nodes. Either way, fewer resources are consumed compared to a situation where enough resources for handling the peak load are running 24/7.
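A metric-based scaling rule can be as simple as the following sketch. The thresholds, the node limits and the function name are assumptions chosen for illustration. In practice the decision is usually delegated to the auto scaling service of the platform in use.

```python
def desired_node_count(
    current: int,
    avg_cpu: float,
    min_nodes: int = 2,
    max_nodes: int = 6,
    scale_up_at: float = 0.75,
    scale_down_at: float = 0.30,
) -> int:
    """Return the target number of cluster nodes for the observed average CPU load.

    avg_cpu is a fraction between 0.0 and 1.0. All limits are hypothetical defaults.
    """
    if avg_cpu > scale_up_at:
        target = current + 1  # cluster is busy: add one node
    elif avg_cpu < scale_down_at:
        target = current - 1  # cluster is idle: remove one node
    else:
        target = current      # load is within the accepted range
    # Never go below the minimum or above the maximum cluster size.
    return max(min_nodes, min(max_nodes, target))
```

Changing one node at a time keeps the cluster from oscillating. A schedule-based variant would replace `avg_cpu` with the current hour and weekday.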

Clustering enables dynamic scaling of Security Servers, but the X-Road does not provide off-the-shelf tools for the implementation as the implementation is platform specific. For example, Amazon Web Services (AWS) Auto Scaling and Elastic Load Balancer (ELB) services can be used to implement dynamic scaling and other cloud platform providers (Microsoft Azure, Google Cloud Platform etc.) have similar services.

Which one to use?

Compared to the Security Server’s internal load balancing feature, an external load balancer provides better support for high availability and scalability from a performance point of view. An external load balancer gives the owner of the provider-side Security Server full control of how load is distributed within the cluster, whereas relying on the internal load balancing leaves the control with the client-side Security Servers. However, setting up a Security Server cluster is more complicated than using internal load balancing, which is a built-in feature and enabled by default. In addition, an external load balancer makes Security Server version upgrades more complex too, as the upgrade process must be coordinated within the cluster. On the other hand, adding new nodes to a cluster is easy as the normal registration process is not required, because all the nodes in the cluster share the same identity. When relying on the internal load balancing each node is independent and has its own identity – adding a new Security Server means that the full registration process must be completed.

In addition, external load balancing has one great benefit compared to internal load balancing. The Security Server health check API used by an external LB recognizes situations where a Security Server is running, but the PIN code is missing. When the PIN code is missing, the Security Server is not able to process messages. Internal load balancing is not able to recognize this situation as it operates on the TCP level – establishing a TCP connection between Security Servers works even if the PIN code is missing and the provider server is not able to process messages. Therefore, internal load balancing might route messages to Security Servers that are not able to process them because of a missing PIN code. This kind of situation might happen after a Security Server has been restarted, but the administrator hasn’t entered the PIN code yet. Fortunately, there’s an easy solution to the missing PIN code problem. Entering the PIN code can be automated using the xroad-autologin add-on, which can be used together with both built-in and external load balancing.

After discussing different alternatives and combinations the next question probably is when different solutions should be used. The bad news is that there is no single right way to do it. The best solution is always case specific and it varies between different use-cases and information systems. Requirements regarding the availability, scalability and performance of the information system must always be taken into consideration and the solution should be designed based on them. Different alternatives can be and should be used together to provide the best overall solution.

Petteri Kivimäki

12 June 2018

X-Road Logs Explained – Part 3

This is the third post in a series about the X-Road logs. The first part was about different log types (technical logs, business logs, audit logs) and the X-Road logs in general. The second part concentrated on the X-Road business log which contains all the messages processed by a Security Server. The third part is about how to provide access to the logs of who has accessed my data and when.

Our data out there

Our personal information is processed by numerous different information systems on a daily basis. Some of the processes are fully automated and others include human actions. Wouldn’t it be nice to know for which purposes your personal information is used, when and by whom? Getting access to this information is already possible, at least in theory, but it requires requesting the information from each registry owner separately. Once you get a response, its format (printed document, email, structured data etc.), level of detail and delivery method (post, email, API etc.) depend on the registry owner, because there’s no unified way of providing this information. In practice, it is currently impossible for a citizen to get a good overall picture regarding the usage of his/her personal information. There must be a better way.

As you all know by now, the X-Road message log contains all the business events processed by a Security Server. Non-repudiation of the message log is guaranteed using time-stamping and digital signatures. Could the message log be used for providing access to the information regarding who has accessed my data and when? Let’s find out.

X-Road message log to the rescue

The message log contains all the required information in a machine-readable format so it might provide a solution to our problem. However, message log is Security Server specific so when a service can be accessed through multiple Security Servers, which is very common in a highly available setup, message log entries are distributed between multiple Security Servers. This means that information must first be collected from multiple sources and then combined. Collecting and combining the information requires shell access to Security Server as there’s no API for querying the logs based on the message content. In practice, collecting and combining the information must be done manually by a Security Server administrator.

Another thing that must be considered is the archiving of the Security Server message log. By default, messages are archived from message log database to disk after 30 days and therefore there must be an additional method for searching data from the archived message log files. This can be done using standard Linux command line tools, but it’s not very efficient when the amount of archived log data is big. In addition, it’s not recommended to keep archived logs on the Security Server so access to a separate long term storage is required and the archived logs might even be encrypted.

Too good to be true?

It seems that all the required information is there in the message log and with some new functionalities the information could be made accessible in a way that manual work is no longer required. Sounds like a good solution if the personal information whose usage needs to be logged is accessed through the X-Road only. In real life, this is rarely the case. Usually personal information is accessed through multiple channels (information system’s own UI, mobile apps, p2p integrations using native APIs etc.) and the X-Road is just one of them. The message log contains information about the messages processed by a Security Server, but all the other channels are excluded. From a citizen’s point of view this kind of partial solution is not sufficient and usage logs must contain the access information from all the different channels. To be able to log all the events processed through different channels, an alternative approach is needed.

The technical challenges mentioned above could be solved using a centralized system for storing usage logs. In general, it is a common practice for an organization to have a centralized log management system that contains logs from all of the organization’s information systems. A log management system could be designed so that it provides a separate index/table/container for information related to access to personal information. In addition, a common access log format that all information systems would use should be defined and implemented. However, unlike the Security Server’s message log, information systems’ logs are not usually signed and time-stamped so non-repudiation is not guaranteed.
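To make the idea of a common access log format more concrete, here is a minimal sketch of what one record could look like. No such format has been agreed on, so every field name and the JSON encoding below are hypothetical and serve only as a starting point for discussion.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AccessLogEntry:
    """One access to personal data. All field names are hypothetical."""

    timestamp: str      # when the data was accessed, e.g. ISO 8601 in UTC
    data_subject: str   # identifier of the person whose data was accessed
    accessed_by: str    # user or system that accessed the data
    organization: str   # organization responsible for the access
    purpose: str        # why the data was accessed
    channel: str        # e.g. "x-road", "web-ui", "mobile-app"

    def to_json(self) -> str:
        """Serialize the entry to a single JSON line for a log management system."""
        return json.dumps(asdict(self), sort_keys=True)
```

Because every channel writes the same structure, entries created through the X-Road, a web UI or a native API can be stored in the same index and queried by `data_subject`.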

Accessing usage logs

Implementing a centralized logging system on organizational level would solve half of the problem - collecting and storing the access logs. The other half, providing citizens access to the usage logs is yet to be resolved. Each organization could build their own solution for viewing the data, but for a citizen getting an overall picture regarding different services would require accessing many different online services and websites. A better solution would be to provide access to all the usage logs from one centralized service or portal, e.g. state portal.

Image 1. Centralized access to usage logs.

Centralized access to all usage logs could be implemented in a distributed or centralized way. The distributed way means that each organization stores its own usage logs and the data is fetched on request only, e.g. when a citizen logs into the state portal and wants to see the access logs of a specific data source. The centralized way means that usage logs are regularly harvested from organization specific storages and stored in one central usage log storage. Both alternatives have their pros and cons, but their further analysis is out of scope of this blog post. For a citizen both alternatives would provide the same result – access to all the usage logs from one place.

Access to the logs through state portal could be implemented using the X-Road. Each organization would implement a common interface for accessing and/or harvesting logs. In this way, even the access to the access log would be logged and transparent to citizens.

Of course, in addition to architectural and technical questions there are many other questions regarding the content and format of access logs that would have to be commonly agreed, e.g. how fine grained the logging should be, how detailed the descriptions should be etc. Failing in this area could make the result very confusing and even misleading for citizens. Badly implemented, the result might even do more harm than good. Therefore, instead of talking about technical details it would be better to concentrate on the targeted outcome from citizen’s point of view first.

Back to the topic

Back to the earlier question – could the message log be used for providing access to the information regarding who has accessed my data and when? Basically yes, but a partial solution would not bring very much value to citizens. In addition, development of new features would be required to remove the manual work regarding handling of the logs. In practice, the message log alone is not a sufficient solution, but it can definitely be used as a part of a wider solution discussed before. Therefore, the message log alone cannot be used for providing access to the information regarding who has accessed my data and when.

That’s it about the X-Road logs for now. There is more to come later...

Petteri Kivimäki

4 June 2018

X-Road Logs Explained – Part 2

This is the second post in a series about the X-Road logs. The first part was about different log types (technical logs, business logs, audit logs) and the X-Road logs in general. The second part concentrates on the X-Road business log which contains all the messages processed by a Security Server – the message log.

Background

The original idea behind the message log was to store tamper-proof, machine-readable evidence of every message processed by a Security Server. By guaranteeing non-repudiation of log entries using digital signatures it was possible to provide undeniable evidence of each transaction. Storing the logs in a unified machine-readable format made it possible to use them in automated processes. In a wider picture, the logs would allow reducing manual work, increasing the level of automation in various processes and making things easier for both users of different information systems and citizens.

Another important aspect was to implement some commonly needed features, such as logging of business events, in an off-the-shelf component that everyone would use. In this way, there was no need to implement the same feature for all the information systems separately. Of course, potential benefits of this approach depend on the starting point of the ecosystem as nowadays logging of business events is required from all the production level information systems. However, the format of logs is not unified between different information systems and not all the systems guarantee non-repudiation of data.

Message log today

Originally, the aim of the X-Road was to provide a secure and standardized way to exchange data that guarantees non-repudiation of the data and provides the evidence in unified machine-readable format. Today, in 2018, the core functionality of the X-Road version 6 can still be described using the same words even if version numbers and technical implementation details have changed many times over the years.

The X-Road version 6 guarantees non-repudiation of the data sent via the X-Road using time-stamping and digital signatures. All the evidence is stored in the message log database from where it is archived to disk using associated signature containers (ASiC) for eIDAS. Security Server owner can access active log records stored in the message log database using a web service interface. Once log records have been archived, accessing them requires shell access to Security Server. No external parties have access to the message log. The X-Road itself is used as a data exchange layer in automated processes between different organizations and information systems, but the message log is not currently used for automation purposes. If something needs to be checked from the message log, manual work from Security Server administrator is always required.

Nowadays Security Server provides a feature that makes it possible to disable logging of message payload that contains the actual business data. This means that message payload is dropped before logging and only message headers with an empty payload are logged. However, time-stamping and signing of messages are always done using the original message which means that it is impossible to verify the signature afterwards as it is created using the original message and message log contains only message headers. Message hash in the signature and message hash calculated using the logged message will never match as the logged message does not contain the payload. This means that all the evidential value of the message log is lost, and it can be used for reporting and statistical purposes only.
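The hash mismatch is easy to demonstrate with a simplified sketch. The Security Server calculates its hashes and signatures according to the X-Road message protocol. The SHA-256 over concatenated headers and payload below, as well as the sample message content, are assumptions used only to show why a logged message without its payload can never match the original.

```python
import hashlib

# Hypothetical message parts, for illustration only.
HEADERS = b"<headers>service=getPerson</headers>"
PAYLOAD = b"<payload>personal data</payload>"


def message_hash(headers: bytes, payload: bytes) -> str:
    """Simplified stand-in for the hash that the signature covers."""
    return hashlib.sha256(headers + payload).hexdigest()


# The signature is always created over the original message.
original_hash = message_hash(HEADERS, PAYLOAD)

# With payload logging disabled, only the headers end up in the message log.
logged_hash = message_hash(HEADERS, b"")

print("hashes match:", original_hash == logged_hash)  # hashes match: False
```

Verification succeeds only when the logged message is identical to the original one, which is why the evidential value is lost as soon as the payload is dropped.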

To log or not to log?

Why disable logging of message payload if the evidential value of the message log is lost? The answer lies in the logical architecture and the type of data that is exchanged. Is Security Server used for exchanging personal data or other sensitive data? Is Security Server seen as a part of the information system that is using it to exchange data or is Security Server seen as a separate, external information system that is integrated with the information system that is using it to exchange data?

Type of data that is exchanged is important, because there are rules and restrictions regarding how personal data and other sensitive data must be handled and processed. In case of personal data, depending on the jurisdiction, the message log may form a person registry when message payload logging is enabled. This means that Security Server must be compliant with technical and non-technical requirements regarding processing of personal data which might differ between different countries and ecosystems. In addition, the interpretation of different legal requirements might vary as well.

When Security Server is seen as a part of an information system containing a person registry, Security Server is one of the system components and therefore personal data stored in the message log remains inside the system boundaries. It is enough that Security Server meets the technical requirements and all the applicable maintenance and operating processes are followed. Instead, when Security Server is seen as a separate, external information system, message log may become an additional person registry and the purposes of processing personal data of the information system that is using the Security Server to exchange data cannot be applied to it anymore. In this scenario disabling logging of message payload can be used to prevent the creation of an additional person registry.

What is the most logical interpretation regarding Security Server’s role with respect to the information system? If we think about Security Server as a message mediator or a message proxy it is easy to see it as a part of the information system rather than an external system. Let's think about the question at a more general level. For example, consider how modern microservice-based systems are structured – usually they consist of multiple independent services that communicate with each other through APIs. All the individual microservices are part of the same system and there’s only a single person registry even if not all personal data is stored in the same physical storage. In addition, all the enterprise level information systems consist of multiple components that are located in different physical or virtual hosts, and usually they’re seen as one system. So why should Security Server be any different?

What next?

Have the original ideas about how the message log could be used for process automation come true and has the message log’s full potential already been reached? The answer to both questions is no, which means that the X-Road could provide even more value to its users than it currently does. Message log records could be used in process automation and for implementing new business features. When logging of message payload is enabled the required data is already there in the message log, but utilizing it for new use cases will require some additional development to make it more accessible than it is now. However, the required effort is small compared to the potential value that could be created.

In the third part of the series about the X-Road logs I’m going to discuss how to provide access to the logs of who has accessed my data and when. Until then.

Petteri Kivimäki

1 June 2018

Changes in the X-Road Development

The X-Road was originally developed by the Estonian State Information Systems Department (at the Ministry of Economy and Communications) and the first version was launched in 2001. In Finland the Suomi.fi Data Exchange Layer service that's based on the X-Road was published in November 2015. Today Finland's and Estonia's data exchange layers are connected to one another which enables cross-border data exchange between the countries. Estonia and Finland have also been developing the X-Road core together since 2015. Now the development is handed over to the NIIS.

First steps

The cooperation between Estonia and Finland officially started in 2013 when the Prime Ministers of Estonia and Finland, Andrus Ansip and Jyrki Katainen, signed the Memorandum of Understanding about the cooperation in the field of ICT. In the beginning of 2014 Estonia gave the X-Road to Finland under EUPL licence and later that year the Finnish X-Road implementation project was kicked off.

The cooperation between Estonia and Finland was not limited to handing over the source code as the countries started to develop the core of the X-Road together. Both countries wanted to share the same X-Road core and maintain the interoperability between X-tee and Suomi.fi Data Exchange Layer to enable cross-border data exchange between Estonia and Finland. This meant that the joint development of the X-Road needed to be coordinated.

Deepening the cooperation

Finland's Population Register Centre and the Republic of Estonia's Information System Authority were responsible for the coordination of the X-Road core development and a set of practices and guidelines were agreed for managing the cooperation. Another important outcome of the collaboration was publishing the source code of the X-Road core as open source under the MIT licence. The source code was published in two parts in 2015-2016 and it was made publicly available on GitHub.

Shared organization

The next step of the cooperation was taken in June 2017 when the NIIS was founded. Now, in June 2018, the NIIS is taking over the X-Road core development from Finland's Population Register Centre and the Republic of Estonia's Information System Authority. The first step of the handover was already completed earlier this year when the NIIS took the responsibility of running the Working Group that is the platform for day-to-day coordination of the joint development. Now it’s time for the NIIS to take over the management of the source code of the X-Road core as well. In practice, this means updated joint development practices and transferring all the related source code repositories to the NIIS’s GitHub account.

What will change?

First of all, for the X-tee and Suomi.fi-palveluväylä member organizations nothing will change. Finland's Population Register Centre and the Republic of Estonia's Information System Authority are responsible for running their national ecosystems and providing all the same support services to their members as so far.

What will change is the joint development model and the locations of the source code repositories of the X-Road core and a few additional components. The new joint development model can be found at:

https://github.com/nordic-institute/X-Road-development/

From the X-Road community’s point of view one of the biggest changes is making the X-Road backlog public. Anyone can access the backlog, leave comments and submit enhancement requests through the X-Road Service Desk portal. Accessing the backlog and service desk requires creating an account, which can be done in a few seconds using the signup form.

Starting from 1st June 2018 new locations of the X-Road core source code repositories are:

https://github.com/nordic-institute/X-Road

https://github.com/nordic-institute/X-Road-tests

https://github.com/nordic-institute/X-Road-tests-environment

Starting from 1st June 2018 new locations of the additional components’ source code repositories are:

https://github.com/nordic-institute/REST-adapter-service

https://github.com/nordic-institute/xrd4j

All the X-Road implementers and developers, please update your remote master repository’s URL to the new master today. From now on all the pull requests and contributions must be submitted to the NIIS managed X-Road master repository. The previous repositories will remain available on GitHub, but they're no longer updated and changes from the new master will not be synced to them.

What next?

The NIIS will continue to develop the X-Road open source technology and welcomes all the interested parties to participate in the development.

The global X-Road Community will meet on 12th September in Tallinn. The event will provide an excellent opportunity to learn more about the X-Road and meet X-Road enthusiasts from all around the world.

Stay tuned!

Petteri Kivimäki

28 May 2018

X-Road Logs Explained – Part 1

Petteri Kivimäki

Like any other application, X-Road produces logs that contain information about how the different components of the application are operating. Application logs can be used for multiple purposes, e.g. monitoring, debugging error conditions and verifying transactions. In general, application logs can be divided into three groups: technical logs, business logs and audit logs. X-Road produces logs in all three categories.

Technical logs

X-Road consists of three main components: Central Server, Configuration Proxy and Security Server. Each of these components is internally divided into several lower-level components or modules, some of which are shared between the main components. For example, the Security Server’s main components are proxy, signer and confclient. Proxy is a Security Server specific component whereas signer is used by all the main components. When it comes to technical logs, each component has its own log file where it writes information regarding its operations. The level of detail can be configured using configuration files, and each lower-level component has its own configuration file. Technical log files are stored in the /var/log/xroad directory. More information about the technical logs can be found at:

https://github.com/ria-ee/X-Road/blob/develop/doc/Manuals/ug-ss_x-road_6_security_server_user_guide.md#17-logs-and-system-services

Business logs

Security Server’s business log is stored in the message log database which contains all the messages processed by the Security Server. Each message is time-stamped and signed which makes it possible to verify the message content afterwards. However, verifying a message requires that the message payload has been logged – logging message payload can also be disabled. By default, time-stamped messages are archived from the database to disk every six hours. Time-stamped and archived messages are kept in the message log database for 30 days until they are removed automatically. Different intervals regarding archiving of messages can be configured through configuration files. More information about the message log can be found at:

https://github.com/ria-ee/X-Road/blob/develop/doc/Manuals/ug-ss_x-road_6_security_server_user_guide.md#11-message-log

Audit log

The Security Server’s audit log is stored on disk and it contains information about all the actions completed by an administrator through the Security Server UI. In this way all the actions that change the configuration or state of the Security Server are logged and can be traced afterwards. The audit log is stored in the /var/log/xroad directory. More information about the audit log can be found at:

https://github.com/ria-ee/X-Road/blob/develop/doc/Manuals/ug-ss_x-road_6_security_server_user_guide.md#12-audit-log

Archiving logs

It is worth mentioning that all the logs produced by the Security Server are local, even in clustered environments. This means that log records are not replicated inside a Security Server cluster. In addition, log records are not automatically transferred to an external host or log storage for archiving purposes.

In most cases there are both business and legal requirements regarding the retention period of different types of logs. X-Road produces the logs, but it is the administrator’s responsibility to configure the transfer of the logs to long-term storage. This is strongly recommended for saving hard disk space and avoiding loss of log records during a Security Server crash. Therefore, it is a good idea to transfer the logs to a centralized logging system or log storage rather than storing them locally on the Security Server.

Technical logs and the audit log can be redirected to an external location using rsyslog. In this case a batch type of transfer is not required as rsyslog forwards the log messages in near real time. However, it is not possible to use rsyslog for transferring archived message log record files, so a batch transfer must be used for them. This can be implemented using a transfer script shipped with the Security Server or using rsync.

Please don’t forget disk space

Another reason why it is highly recommended to transfer log records to external storage rather than storing them locally is the hard disk space they consume. The message log database and message log archive files in particular may consume a significant amount of disk space on high-traffic Security Servers.

Each SOAP request contains around 9.5 kB of metadata (SOAP namespace definitions, headers, signatures) and each response message contains around 11.2 kB of metadata. Based on this, each successful query produces around 21 kB of metadata in the message log. In addition, once a minute the Security Server batch time-stamps all the messages processed since the previous time-stamp operation, which generates an additional 3.6 kB of metadata per minute.

Message log space requirements can be estimated using the formula below:

3.6kB + N * (21kB + R + A) = S

N = number of requests per minute
R = size of request payload in kB
A = size of response payload in kB
S = disk usage per minute (kB / min)

Example

Let's assume that a system receives 100 requests per minute. Request payload size is 4 kB and response payload size is 8 kB.

3.6 kB + 100 * (21 kB + 4 kB + 8 kB) = 3303.6 kB / min = 3.3 MB / min

This makes 4.8 GB per day and 143 GB per month.

In addition, archived message log records require around 43 % of the space required by the active message log records so the required disk space in total is:

4.8 + 2.1 GB = 6.9 GB / day
143 + 61 GB = 204 GB / month
1716 + 738 GB = 2454 GB / year
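
As a rough sketch, the formula and the example above can be expressed in Python. The constants are the figures quoted in this post; the function names and the decimal unit conversion (1 GB = 1,000,000 kB) are assumptions made for illustration only.

```python
# Figures quoted in the text above.
METADATA_PER_QUERY_KB = 21.0   # request + response metadata per successful query
TIMESTAMP_BATCH_KB = 3.6       # batch time-stamp metadata generated once a minute
ARCHIVE_RATIO = 0.43           # archived records relative to active records


def message_log_kb_per_minute(requests_per_minute, request_kb, response_kb):
    """S = 3.6 kB + N * (21 kB + R + A)"""
    per_query = METADATA_PER_QUERY_KB + request_kb + response_kb
    return TIMESTAMP_BATCH_KB + requests_per_minute * per_query


def daily_usage_gb(kb_per_minute, include_archive=True):
    """Convert kB/min to GB/day, optionally adding the archive overhead."""
    gb_per_day = kb_per_minute * 60 * 24 / 1_000_000  # assumes 1 GB = 1,000,000 kB
    if include_archive:
        gb_per_day *= 1 + ARCHIVE_RATIO
    return gb_per_day


per_minute = message_log_kb_per_minute(100, request_kb=4, response_kb=8)
print(round(per_minute, 1))                                         # 3303.6
print(round(daily_usage_gb(per_minute, include_archive=False), 1))  # 4.8
print(round(daily_usage_gb(per_minute), 1))                         # 6.8
```

The last figure comes out slightly below the 6.9 GB shown above because the example in the text rounds the intermediate values before adding them up.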

When looking at the example above it is easy to understand why handling and archiving of log records should be planned in advance. The message log may generate a huge amount of data and no one wants to end up in a situation where a full disk causes a service break in a critical business system.

What next?

Many of you may ask what the point of collecting and archiving all these logs is. Wouldn’t it be easier to free up disk space by just deleting them? The features related to logging are not a coincidence and they are there for a reason. In the next blog post regarding the X-Road logs I will give more insight into the background of the message log and why it works the way it does.

Petteri Kivimäki

14 May 2018

X-Road REST Support Workshop

Petteri Kivimäki

The NIIS organized an X-Road REST support ideation and planning workshop on 8th May in Tallinn. The workshop was targeted at the participants of the earlier REST survey who had expressed their interest in the workshop and in further involvement in the planning process. The aim of the workshop was to get more insight on implementing REST support from real X-Road users.

Image 1. Beginning of the REST support workshop.

Workshop agenda

The workshop had around 20 participants from Estonia and Finland who were representing different public and private sector organizations. The program of the workshop was built around three themes:

X-Road REST support survey results
REST support implementation alternatives
Next generation X-Road

The format of the workshop allowed the participants to have time for discussions and group work. Each section started with an introductory presentation on the topic, which was followed by a group assignment and a joint discussion.

X-Road REST support survey results

The first part concentrated on the X-Road REST support survey results. Overall, the participants’ opinions were aligned with the survey results – basic support which means consuming and producing SOAP and REST services using their native implementations is enough. Automatic conversion between different service types is not required.

The most intensive discussion was about defining REST in the context of the X-Road. Unlike SOAP, which is a protocol with a detailed specification, REST is an architectural style consisting of best practices and guidelines. Instead of talking about REST in general, it should be defined in more detail what supporting REST actually means – in the X-Road’s case a loose definition would be supporting JSON and/or XML over HTTP. More detailed guidelines regarding the API provided by the X-Road will be defined during the next steps of the process.

Another hot topic was service descriptions of REST services. The discussion was about the technique or language used for producing the descriptions and whether service descriptions should be optional or mandatory. The current idea is to produce REST service descriptions using the OpenAPI Specification (OAS), but the participants requested more insight into the different alternatives. REST service descriptions should be mandatory just like SOAP service descriptions currently are. Their role is essential for a service consumer, and their quality can make a huge difference to the time required for implementing a client application consuming the service. To what extent the X-Road will validate the content of service descriptions will be defined later.

REST support implementation alternatives

In the beginning of the second part two alternative basic implementation approaches were presented to the participants whose task then was to comment on them and/or present their own ideas and visions. In addition, a list of open issues to consider was given to the participants as an input for the assignment, e.g. how query parameters are handled, how HTTP headers are handled, how to transfer X-Road specific request/response data, which parts of the message must/should be signed etc.

Image 2. Current security server architecture (simplified).

The first approach was about adding an additional REST proxy component to the security server that just wraps REST messages inside SOAP. This approach does not require changing the X-Road message transport protocol, but it adds more overhead to REST messages compared to SOAP.

Image 3. Approach 1 - additional REST proxy component.
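
To make the first approach more concrete, here is a minimal sketch of what wrapping a REST message inside SOAP could look like. It is not the design presented at the workshop: the restRequest element, its attributes and its namespace are hypothetical and exist only for illustration. Only the SOAP 1.1 envelope namespace is real.

```python
import base64
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
WRAP_NS = "http://example.org/rest-wrapper"            # hypothetical, for illustration


def wrap_rest_in_soap(method, path, json_body):
    """Wrap a REST call (method, path, JSON body) inside a SOAP envelope."""
    ET.register_namespace("SOAP-ENV", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    wrapped = ET.SubElement(
        body, f"{{{WRAP_NS}}}restRequest", {"method": method, "path": path}
    )
    # Base64 keeps the JSON payload opaque to XML processing.
    wrapped.text = base64.b64encode(json_body.encode("utf-8")).decode("ascii")
    return ET.tostring(envelope, encoding="utf-8")


def unwrap_rest_from_soap(soap_bytes):
    """Recover the original REST call from the SOAP envelope."""
    envelope = ET.fromstring(soap_bytes)
    wrapped = envelope.find(f"{{{SOAP_NS}}}Body/{{{WRAP_NS}}}restRequest")
    payload = base64.b64decode(wrapped.text).decode("utf-8")
    return wrapped.get("method"), wrapped.get("path"), payload


json_body = '{"name": "example"}'
soap = wrap_rest_in_soap("POST", "/items", json_body)
print(len(json_body), len(soap))  # the wrapped message is several times larger
print(unwrap_rest_from_soap(soap) == ("POST", "/items", json_body))
```

Even in this toy version the envelope is considerably larger than the JSON payload it carries, which illustrates the overhead mentioned above.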

The second approach was about changing the X-Road message transport protocol to support generic message payloads instead of the current SOAP payload. This approach makes it easier to support other message formats in the future and it also makes processing REST messages faster compared to SOAP, but it requires major changes to the X-Road message transport protocol and many existing key components.

Image 4. Approach 2 - changes to the X-Road message transport protocol.

The assignment generated many new ideas and approaches. Some groups used one of the two presented alternatives as a starting point, while others defined their own approach from a clean slate. Some approaches required changes to existing security server proxy components and others were based on adding new components and a parallel communication channel between security servers. The most radical idea was to implement the security server as a software library and redesign the X-Road communication model. However, two things were commonly agreed: 1) REST support should be native and implemented on the transport protocol level and not wrap REST messages inside SOAP and 2) the changes must be backwards compatible and they should not affect existing SOAP services.

Next generation X-Road

In the beginning of the third part the NIIS technical roadmap for 2018 and some possible future enhancement ideas were presented. Then it was the participants’ turn to share their vision regarding the future of the X-Road.

The main theme was to make the use of the X-Road easier and streamline both member and security server registration process. Developers were not forgotten either - a developer version of the security server that can be set up in minutes and that does not require registration was discussed too.

When it comes to the technology, supporting containers and better support for cloud platforms were on top of the wish list. Blockchain – a technology that the X-Road is not based on and that it does not currently use internally – was also discussed as an alternative for distributing global configuration. In addition, the possibility to join the X-Road using a custom endpoint based on a software library implementing the required protocol stack, instead of the official security server software package, was discussed too.

To summarize, the next generation X-Road should be a well-maintained, interoperable and platform-independent solution.

What next?

The outcome of the workshop clearly exceeded expectations. The workshop provided active discussion, intensive group work, fresh ideas and valuable input for the next phase of the planning process. A big thanks to all the participants.

The NIIS will continue planning at a more detailed level, based on the input received from the workshop. New blog posts regarding the topic and the next steps will be published during the coming months.

Petteri Kivimäki

3 May 2018

X-Road REST Survey Results

Petteri Kivimäki

The Nordic Institute for Interoperability Solutions (NIIS) did a survey regarding REST support for the X-Road. The results don’t leave any room for doubts – 93 % of the participants want the X-Road to support REST. Only 7 % of the participants are not interested in REST at all. The X-Road community has spoken.

About the survey

The survey was open from 19th March till 13th April and it was promoted through different social media channels of the NIIS. In addition, all the Estonian X-tee members and Finnish Suomi.fi Data Exchange Layer (Suomi.fi-palveluväylä) members were invited to answer the questions. The survey collected 75 responses of which 32 are from the Suomi.fi Data Exchange Layer members, 24 are from the X-tee members and 19 are from other parties interested in the X-Road.

Summary

One thing is sure – the answers show without a doubt that REST support is wanted by the X-Road community. According to the majority of the participants it’s enough to support consuming and producing services using their native implementation. It is also enough to have the service descriptions available based on the native implementation of the service – WSDL or OpenAPI specification. Automatic SOAP-REST conversion is not expected which means that if a service provider wants to provide both SOAP and REST versions of the same service the provider must implement both versions.

In many questions the differences between answers are not big, and it must also be considered that some of the participants did not express an opinion. If all the “I don’t know / I’m not sure” answers were added to the answer that came second, the first and the second answer would be even. This does not change the fact that REST is wanted, but it has an effect on the level of support that is expected. On the other hand, if all the “I don’t know / I’m not sure” answers were added to the answer that came first, the difference would be even clearer than it is now.

It is also worth noting that more than half (61 %) of the participants are interested in how the X-Road handles SOAP and REST messages internally. In addition, this information affects their decisions when planning how to integrate their services. Based on this, it matters how the REST support is implemented.

Results

The questions and their answers are presented below.

Image 2. Would you like to consume or produce REST services through X-Road?

Image 3. What type of REST services (CRUD) you would like to consume or produce?

Image 4. Should all the services be available using both SOAP and REST regardless of their native implementation?

Image 5. Should all the service descriptions be available in WSDL (SOAP) and OpenAPI specification (REST) regardless of the native implementation?

Image 6. Are you interested in the details how X-Road handles SOAP and REST messages internally?

Image 7. Should REST version of a SOAP service and SOAP version of a REST service be automatically provided by X-Road?

What next?

The survey provided more insight regarding expectations for the X-Road REST support and the results serve as a great input for the next phase of planning.

The NIIS will organize an X-Road REST workshop in Tallinn in May. The workshop is targeted at the participants of the survey who expressed their interest in the workshop and left their contact information when they completed the survey. More information regarding the REST support and the outcome of the workshop will be available in May. The road towards REST will continue.

Petteri Kivimäki

26 April 2018

There is no blockchain technology in X-Road

Petteri Kivimäki

Recently there have been multiple writings about the X-Road which have stated that X-Road is a blockchain based technology or it utilizes blockchain internally. Are these claims true, is X-Road based on blockchain? Let’s take a look at the facts.

Blockchain

Blockchain is one of this year’s buzzwords and one of the hottest technologies out there. Blockchain became known as the technology behind bitcoin – the first cryptocurrency launched in 2009. Since then its use has expanded to cover many different business areas and use cases in addition to cryptocurrencies.

Blockchain is a distributed, decentralized and public database that stores transactions in a chain that protects them against alterations and ensures data integrity. Blockchain is a peer-to-peer network where all the nodes are equal and every node has a full copy of the blockchain. The data stored in a blockchain cannot be altered afterwards without altering all the subsequent blocks and replicating the changes to all the nodes of the network. This makes tampering with the data stored in a blockchain extremely difficult.

Transaction data is stored in blocks in the form of a Merkle tree. Consecutive blocks are linked to each other so that together they form a chain. Each block contains the cryptographic hash of the preceding block in the chain, which makes it possible to verify the order and the integrity of the blocks from any block back to the very first block of the chain, the genesis block. This makes it possible to audit and verify all the transactions in the chain.
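
The linking of blocks can be illustrated with a minimal Python sketch. It is a simplification made for illustration: the block layout and function names are made up, the transactions are stored as a plain list instead of a Merkle tree, and there is no network or consensus involved.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # the genesis block has no predecessor


def block_hash(block):
    """Hash a block's full content, including the hash of its predecessor."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Add a new block that stores the hash of the preceding block."""
    prev_hash = block_hash(chain[-1]) if chain else GENESIS_PREV_HASH
    chain.append(
        {"index": len(chain), "prev_hash": prev_hash, "transactions": transactions}
    )


def verify_chain(chain):
    """Walk from the latest block back to the genesis block."""
    for i in range(len(chain) - 1, 0, -1):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return not chain or chain[0]["prev_hash"] == GENESIS_PREV_HASH


chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
append_block(chain, ["carol pays dave 1"])
print(verify_chain(chain))  # True

chain[1]["transactions"][0] = "bob pays carol 200"  # tamper with an old block
print(verify_chain(chain))  # False
```

Changing an old block changes its hash, so the block that follows no longer points to it. To hide the change, every subsequent block would have to be recomputed as well.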

Blockchain does not have a central authority so the nodes need to come to a consensus before a new block can be added to the chain. This is achieved by using a consensus protocol or consensus mechanism. The most common consensus mechanisms are called Proof of Work (PoW) and Proof of Stake (PoS).

X-Road

X-Road is an open source data exchange layer solution that enables organizations to exchange information over the Internet. X-Road is a centrally managed distributed integration layer between Information Systems that provides a standardized and secure way to produce and consume services. X-Road ensures confidentiality, integrity and interoperability between data exchange parties.

The identity of the service producers (e.g. base registries) and consumers (e.g. web portals) is maintained centrally by the X-Road operator and all the data is exchanged directly between the data consumer and provider. All the evidence regarding the data exchange is stored locally by the data exchange parties, and no third parties have access to the data. Time-stamping and digital signature together guarantee non-repudiation of the data sent via X-Road.

X-Road supports batch signatures and batch time-stamping. Batch signatures are created for messages that contain attachments. Batch time-stamping means that time-stamps are created asynchronously in batches for all the messages that have been processed since the last batch time-stamping. Batch time-stamping is used for reducing the load on the time-stamping service. Both of these features are based on Merkle hash trees, and the messages processed during a single batch are linked to each other through a hash chain. Using the hash chain it is then possible to verify that a selected message is a part of a certain batch signature. However, there is no link between different batches and the messages in them, so there’s no chain that would link all the batch processed messages together.
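
The idea behind a Merkle hash tree can be sketched in a few lines of Python. This is a generic illustration and not the actual X-Road hash chain format: the function names are made up, and duplicating the last node on odd-sized levels is just one common convention chosen here as an assumption.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).digest()


def merkle_levels(messages):
    """Return all levels of the tree, from the leaf hashes up to the root."""
    if not messages:
        raise ValueError("at least one message is required")
    level = [sha256(m) for m in messages]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]  # assumption: duplicate the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def inclusion_proof(levels, index):
    """Collect the sibling hashes needed to recompute the root for one message."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling is left)
        index //= 2
    return proof


def verify_inclusion(message, proof, root):
    """Recompute the root from a single message and its proof."""
    node = sha256(message)
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root


batch = [b"message-1", b"message-2", b"message-3"]
levels = merkle_levels(batch)
root = levels[-1][0]  # only this value would need a signature or time-stamp
proof = inclusion_proof(levels, 2)
print(verify_inclusion(batch[2], proof, root))    # True
print(verify_inclusion(b"tampered", proof, root))  # False
```

Only the root has to be signed or time-stamped once per batch, and a single message can still be verified against it without having the other messages of the batch at hand.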

The security server stores all the processed messages with their signatures and time-stamps in the message log database. The log records are archived to disk regularly and removed from the database after that. In the message log archive file each message has a cryptographic hash that depends on the previous message that was archived and the chain continues over different archive files. In this way the message log archive files create a chain that contains all the messages processed on a single Security Server. This means that the messages stored in the message log archive files cannot be modified without breaking the chain.

Is X-Road based on blockchain?

Blockchain is a decentralized and distributed database which is updated through a consensus protocol. All the nodes of the network are equal and every node has a full copy of the database. The blocks stored in the database are linked to each other using cryptographic hash functions.

The X-Road Security Server message log archive files contain all the messages processed by a single Security Server. Messages included in the files are linked to each other using cryptographic hash functions. The files are stored locally and the server hosting the files is responsible for creating them. Each Security Server has its own unique chain of processed messages. Other members of the X-Road ecosystem do not have access to the files.

The common factor between blockchain and X-Road is that they both use cryptographic hash functions for linking data items to each other. Beyond that, the two have very little in common, as they serve very different purposes and use cases. Cryptographic hash functions existed well before blockchain, so the fact that both blockchain and X-Road use them does not mean that X-Road is based on blockchain. Both a bicycle and a car have wheels, but we don’t say that the car is based on the bicycle just because the bicycle was the first one to use wheels. The same goes for blockchain and X-Road.

Based on the arguments presented above the outcome is that X-Road is not based on blockchain and does not use it internally.

Petteri Kivimäki

3 April 2018

X-Road and REST

Petteri Kivimäki

X-Road is an open source data exchange layer solution that enables organizations to exchange information over the Internet. More information about the X-Road is available at:

https://www.niis.org/data-exchange-layer-x-road/

X-Road and REST have been a topic of public discussion for quite some time already. Today the X-Road does not have built-in support for REST, but that does not mean nothing has happened regarding the topic in recent years. And there’s even more to come later this year...

At the moment REST services can be produced and consumed over the X-Road using the REST Adapter Service component. The service supports a limited set of use cases so it’s not an answer to all X-Road REST integration questions. However, it is an off-the-shelf component that provides an X-Road compatible REST-SOAP converter, and it can be set up through configuration alone – no coding is needed. Compared to a custom-built solution it can save a great deal of effort.

https://github.com/vrk-kpa/REST-adapter-service

How does it work today?

X-Road message exchange protocol is based on SOAP and all the information systems and services that exchange data over the X-Road must implement the protocol. For older SOAP based information systems and systems that have been using the X-Road for years this is not a problem as the systems already have a working implementation of the X-Road message exchange protocol.

For new information systems and new X-Road user organizations things might not be that simple, because nowadays most APIs are RESTful in nature and use JSON instead of XML. This means that an additional REST-SOAP adapter service must be implemented between the information system and the X-Road Security Server. The REST Adapter Service component is one alternative, or organizations may implement their own custom-built solutions. Either way, it is technically doable, but not a very compelling alternative for organizations that have already moved away from SOAP and have implemented their APIs using REST and JSON. In addition, all the extra components in the stack bring more overhead, delay, maintenance work and costs.

What do the numbers say?

ProgrammableWeb is one of the largest information and news sources about the Web as a programmable platform and it maintains a directory of over 15,500 web APIs. According to its statistics, REST is the most common architectural style with a share of 81 % of all the APIs listed in ProgrammableWeb’s API directory. At the same time RPC’s share, which also includes SOAP, is just under 9.5 %. These numbers give a good overall picture of the popularity of different architectural styles today. The whole article is available at:

http://web.archive.org/web/20220712162418/https://www.programmableweb.com/news/which-api-types-and-architectural-styles-are-most-used/research/2017/11/26

What next?

Based on the available statistics and the public discussion in recent years, it seems obvious that the X-Road needs better support for REST than the REST Adapter Service is able to provide now. The lack of REST support is slowing down the adoption of X-Road and generating unwanted additional work for many X-Road user organizations. However, adding support for REST does not mean dropping support for SOAP – not any time soon, at least. Instead, the two architectural styles can co-exist, which means that all the current SOAP services must still be supported after the REST support has been implemented. How long both SOAP and REST must be supported side by side will then be another discussion.

Tell us what you think!

What do you think about the X-Road and REST? The Nordic Institute for Interoperability Solutions (NIIS) is doing a survey regarding REST support for the X-Road. The survey is open until the 13th of April, tell us what you think:

https://buff.ly/2G7wtl9