From The Blog

July 27, 2021

Decrypting Enclaves: Encryption Key Hierarchy

Intel SGX is an implementation of a Trusted Execution Environment (TEE): an environment where data confidentiality, data integrity and code integrity are protected by hardware-based encryption. Intel SGX isolates specific application code and data in protected regions of memory called “enclaves.” Conclave makes it very easy for you to interact with enclaves in high-level languages such as Java and Kotlin. Before we dig deep into Conclave, let’s talk a bit about the different encryption keys used throughout an enclave’s lifecycle. 

An enclave uses “keys” as the root of trust instead of using certificates. They are the foundation for the software chain of trust. 

There are multiple keys used at multiple points during an enclave’s lifecycle. To start with, Intel fuses two root keys into the CPU at the time of manufacturing; all other keys are derived from these two. Keys are used in the attestation process to prove that a particular piece of code is running on a trusted SGX-enabled CPU, which establishes the authenticity of the enclave your code runs in. Keys are also used during sealing to encrypt your data. Now let’s dive into the different keys used in these scenarios.

Root Provisioning Key: This key is fused into the processor by Intel at the time of CPU manufacturing. Intel acts as a certificate authority (CA) and issues a digital certificate attesting to the SGX-enabled CPU’s identity. A copy of this key is securely stored in an HSM within a facility managed by Intel, which allows Intel to verify that the CPU is a genuine SGX-enabled part during remote attestation.

Root Sealing Key: This key is also fused into the CPU by Intel at manufacturing time. Unlike the root provisioning key, Intel does not retain a copy: it is unique to and known only to the CPU, and it is used for sealing.

Let’s take a look at some common structures used by Intel SGX enclaves.

MRENCLAVE: This is the enclave’s identity, a SHA-256 hash of the log that records all activity while the enclave is being built. This log consists of code, data, heap, stack, and other attributes of an enclave. Once the build is complete, the final value of MRENCLAVE represents the identity of the enclave. In short, it’s the hash of the enclave code and initial data.

MRSIGNER: Each enclave is also signed by its author. MRSIGNER contains the hash of the public key of the author.

REPORT: The enclave creates a REPORT structure consisting of MRENCLAVE, MRSIGNER, and additional enclave attributes.

Enclaves cannot access these root keys directly; instead, the CPU uses a derivation function to derive working keys from them. The derivation function takes MRENCLAVE, MRSIGNER, the current CPU microcode security version number and other enclave attributes as inputs. A nonce is also given as an input to this function to add entropy. Usually this is a password specific to the owner of the system, allowing the owner to cryptographically destroy any data sealed by the system when it is deprovisioned.
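
To make that concrete, here is a minimal, purely illustrative Java sketch of such a derivation. It is not the real SGX EGETKEY mechanism, which runs inside the CPU and never exposes the root keys to software; it simply shows how the inputs described above could be mixed into a derived key using an HMAC-based construction. All names here are hypothetical.

import java.nio.ByteBuffer;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: models how an EGETKEY-style derivation mixes its inputs.
final class KeyDerivationSketch {
    static byte[] deriveKey(byte[] rootKey,     // never readable by software in real SGX
                            byte[] mrenclave,   // enclave measurement
                            byte[] mrsigner,    // hash of the signer's public key
                            int cpuSvn,         // microcode security version number
                            byte[] ownerEpoch)  // owner-supplied entropy (the "nonce")
            throws Exception {
        Mac hmac = Mac.getInstance("HmacSHA256");
        hmac.init(new SecretKeySpec(rootKey, "HmacSHA256"));
        hmac.update(mrenclave);
        hmac.update(mrsigner);
        hmac.update(ByteBuffer.allocate(4).putInt(cpuSvn).array());
        hmac.update(ownerEpoch);
        return hmac.doFinal(); // the derived key, e.g. a seal key or report key
    }
}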

Below are examples of keys that enclaves can derive.

Report Key: This is derived from the root sealing key and is used during attestation. An enclave must prove its authenticity and code integrity either to a client or an enclave running on a different system (remote attestation) or to another enclave running on the same system (local attestation). For local attestation, the host obtains a report from the enclave. This report contains the MRSIGNER, MRENCLAVE and other report attributes, including some user data, which is normally a public key for communicating with the enclave. The report structure is actually the same for local and remote attestation. The report is then signed inside the enclave using the report key: a hash of the report data is MACed with the report key, and the signed report is sent to another enclave on the same system via a host process. The receiving enclave has access to the same report key, so it can compute a fresh hash over the report, compare it against the one the sending enclave produced, and confirm the two match. 
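
Here is a rough sketch of that symmetric check, assuming both enclaves have derived the same report key. It uses HMAC where real SGX uses a CMAC computed by CPU instructions (EREPORT/EGETKEY), so treat it as an illustration of the idea rather than the actual mechanism.

import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: verifying a locally attested report with a shared key.
final class LocalAttestationSketch {
    static byte[] mac(byte[] reportKey, byte[] reportBody) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(reportKey, "HmacSHA256"));
        return mac.doFinal(reportBody);
    }

    // The target enclave recomputes the MAC and compares in constant time.
    static boolean verify(byte[] reportKey, byte[] reportBody, byte[] receivedMac)
            throws Exception {
        return MessageDigest.isEqual(mac(reportKey, reportBody), receivedMac);
    }
}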

Provisioning Key: This is derived from the root provisioning key. This key is used by the CPU to identify itself to Intel for attestation. Remote attestation proves to a client that the correct code is deployed on a fully patched, SGX-enabled CPU. The attestation protocol can be either EPID or DCAP. 

For both types of attestation, the current host (the process into which the enclave is loaded) sends the REPORT to another enclave called the Quoting Enclave, which verifies the REPORT (signed using the report key) and signs it using an attestation key rooted in the provisioning key.

For Enhanced Privacy ID (EPID), the REPORT signed by the quoting enclave (called the “quote”) is sent to the Intel Attestation Service (IAS) by the host or the client, depending on the application (for Conclave, this is always the host). The IAS then verifies the quote and signs it with a certificate that is rooted in the Intel SGX root of trust. This Intel-signed verification can then be checked by the host and sent to a client, who only needs to check that the certificate is valid and trusted. This is mostly a legacy approach.

Data Center Attestation Primitives (DCAP) works a bit differently. The quote is signed using the same provisioning key as before; however, rather than sending the quote to Intel for verification, the host requests information about the platform from the Intel Provisioning Certification Service (PCS). This information is called “collateral” and includes Intel-signed information about the platform that can be used to verify the quote without sending it to Intel. The report/quote and the collateral are sent to a relying party (the client), which can then verify the collateral against the trusted Intel root certificate and verify the quote against the collateral. This is a newer approach focused on data centers and cloud service providers. DCAP provisioning is based on ECDSA signatures, which allows for the construction of on-premises attestation services.

Seal Key: This key is derived from the root sealing key. The memory used inside an enclave is encrypted by hardware and is isolated from other processes and applications. When the enclave stops, data in its memory cannot be recovered. Sealing is a technique used to encrypt and export data outside the enclave without compromising data confidentiality or integrity. Once outside, the sealed data can be transmitted over a network or saved to an external storage location like an external database or hard disk. Sealing keys can be derived from either MRENCLAVE or MRSIGNER.

Sealing to MRENCLAVE makes the key available only to instances on the same physical system with the exact same MRENCLAVE; no future version of the software will be able to read this enclave’s secrets. Sealing to MRSIGNER makes the key available to any enclave running on the same physical system that has been signed by the same author, making upgrades much easier. This allows newer enclaves to read the secrets of older versions, but requires clients to trust the enclave signer. That trust can be established through a defined enclave audit process. Sealing keys are known only to the enclave, so only the enclave can decrypt the data.
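
As a rough illustration of what sealing looks like at the application level, the sketch below encrypts a payload with AES-GCM under a seal key. In real enclave code the key comes from the CPU’s key-derivation mechanism, bound to either MRENCLAVE or MRSIGNER as described above; here it is simply passed in as a parameter, so treat the whole thing as a conceptual sketch rather than real sealing code.

import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: sealing data under a policy-bound key.
final class SealingSketch {
    // sealKey: a 16-byte key the CPU derived for this enclave, bound either
    // to MRENCLAVE (this exact build) or MRSIGNER (any enclave by this author).
    static byte[] seal(byte[] sealKey, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];                      // fresh nonce per message
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(sealKey, "AES"),
                new GCMParameterSpec(128, iv));        // 128-bit auth tag
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] sealed = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, sealed, 0, iv.length); // keep the IV with the blob
        System.arraycopy(ciphertext, 0, sealed, iv.length, ciphertext.length);
        return sealed;                                 // safe to store outside the enclave
    }
}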

Want to learn more?

Here are some helpful resources to learn more about Conclave and Confidential Computing:

July 06, 2021

The Why and How of Confidential Computing

R3 recently launched Conclave, a new confidential computing platform that allows multi-party data to be pooled privately and securely.

The Data Lifecycle

When it comes to data protection and the data lifecycle, there are three states:

    1.   Data at rest (currently protected by full disk/file encryption)
    2.   Data in transit (currently protected by transport layer security)
    3.   Data in use (now being protected by zero knowledge proofs, multi-party computation and confidential computing)

For years, cloud providers have offered encryption services for protecting data at rest (in storage and databases) and data in transit (moving over a network connection). However, it wasn’t until confidential computing was created that there was a solution for protecting data in use (during processing or runtime).

Intel® SGX

Intel® SGX, the technology behind Conclave, protects data in use using secure enclaves or Trusted Execution Environments (TEEs). The protection was initially applied only in hardware in the CPU. However, with the increasing adoption of cloud computing, the need for this protection to be available in the cloud became more urgent, especially for more sensitive workloads. It was with this awareness that the industry realized it needed a name for the concept, and confidential computing was chosen.

With SGX, you can prove exactly how the data will be used. The content of the enclave—the data being processed and the techniques used to process it—are accessible only to the parties submitting the information and invisible to anything or anyone else, including the cloud provider. 

Conclave

SGX addresses the problem of how to protect data in use, but what does Conclave add to the mix? The key problem that Conclave addresses is the fact that implementing and using these technologies usually requires a deep understanding of data security in order to prevent side channel exposure, data leaks, holes in your implementation and other problems. Conclave makes it very easy to develop protected applications without having to hire data security specialists. 

There are other solutions in the market that seek to address this problem. One solution, known as “Lift and Shift,” involves taking a virtual machine and running it in the protected environment. However, when you access the data it will be vulnerable. Therefore, you have to compromise between protection and the need for access. 

Another solution is to develop a custom application or modify your existing application to separate the most important parts of the data and put them into the enclave. The problem with this is that it is very complex. You would need to do it in C++, go through lots of documentation and have a deep understanding of data security. 

In contrast to the above, Conclave provides deep protection in a JVM so you can build in Java, Kotlin or JavaScript. This offers a solution that will be secure and easier to use for the developer. 

By leveraging confidential computing, you will be able to provide technical assurances/cryptographic proof that you are running a real SGX-protected enclave (also known as attestation). With these assurances in hand, you won’t need to rely on a firm’s reputation to determine whether or not you can trust them with your data, as you will have technical attestation of how it is being used. The ability to prove how your data will be processed is making it possible for businesses to collaborate and share more sensitive datasets. 

How ClaimShare leverages the power of Conclave

As discussed, Conclave allows the pooling of sensitive data in a secure, invisible manner. This means companies can pool data together and run mutually beneficial algorithms and processes without exposing customer information or trade secrets to the other parties involved.

One application where this could be useful is in detecting multi-party insurance fraud. This could involve an item being purchased and then insured with multiple companies. The below example is from Richard Gendal Brown, R3’s CTO:

“Say a car is bought for $10,000. The car is then insured at a cost of $500. This insurance will net a compensation of $8,000 in the event the car is written off in an accident. Now more policies are taken out, until we have 10 policies on the same car, all with different insurers. 10 policies x $500 = $5,000, plus the price of the car = $15,000. Now we crash the car and start claiming all of those policies: 10 x $8,000 = $80,000.” 

This fraud is difficult for insurers to detect because there is no centralized insurance claims database. However, with confidential computing, this type of sensitive data can be pooled in a protected area, processed by verifiable algorithms, and only the fraudulent claims will be highlighted (all other claims will be kept private, even to those running the algorithm). Through collaboration with KPMG, IntellectEU used Conclave to develop ClaimShare, a solution designed to address the problem of duplicate claims in the insurance space. 

Want to learn more?

Here are some helpful resources to learn more about Conclave and Confidential Computing.

July 01, 2021

Confidential Computing: Secure Data Analysis on the Cloud with Conclave

Is it possible for organizations to use machine learning and big data technologies without affecting the privacy of users? How could companies derive insights from their data without compromising on security and protect against possible data breaches and fraudulent attacks that could happen in cloud computation? The answer may be confidential computing, an emerging technology that protects data even when in use. This new technique makes it possible for organizations to securely share, pool and process data in the cloud without exposing any of it to the outer world. 

The data lifecycle

With the rise of new data-driven applications, companies are collecting and storing amounts of data that would have seemed impossible a decade ago. Since much of this data relates to users’ privacy, it becomes crucial to protect and secure it against attacks and breaches. A data breach could violate GDPR laws in countries where they apply, so a system architect’s most important task is often to identify sensitive data and determine how best to protect it. Thus, data needs to be protected throughout its lifecycle: at rest, in transit and in use.

Data at rest means inactive data stored in any digital form: databases, data lakes, or other types of storage technology. This data is currently protected using techniques such as tokenization, encryption, and access control, meaning that even if the stored data is copied or moved, it cannot be read without the keys. 

Data in transit includes any data being moved through the network between applications, servers, or clients. This data is protected from unauthorized access using the TLS/SSL protocol. 

But what about when data is being computed on, i.e. when it is in use? To run any sort of analysis, the data must be in clear text.

The problem—and solution—to protecting data in use

Organizations often need to perform operations on data in use, such as search, query, analysis, and machine learning. However, to do this, the encrypted data from databases must be decrypted into clear text before it can be used for any sort of computation.

Once decrypted, this clear text data gets exposed to the underlying operating system and the host machine, meaning that any malware application running on the host machine could dump the memory contents and steal sensitive information. So even if your data remains encrypted in storage, it becomes vulnerable to exposure in memory during computation.

When these types of computation are hosted on the cloud, this becomes an even bigger risk. Your data is exposed to the vulnerabilities of the host operating system, hypervisor, hardware, and the cloud provider’s orchestration system. As a result, companies dealing with highly sensitive user data such as credit card details, user information, and KYC documents are often reluctant to host computation on the cloud.

Fortunately, confidential computing solves this problem. This emerging technology isolates sensitive data in a secure enclave, or Trusted Execution Environment (TEE), during processing. By doing so, the contents of the enclave, the data being processed, and the techniques used for computation are accessible only to authorized programming code and invisible to all external parties, including the cloud provider. This enables organizations to share, pool and process sensitive datasets in the cloud, safe in the knowledge that it won’t fall into the wrong hands.

Conclave

R3’s new Conclave platform makes it easy to perform confidential computing either on your own machines or in the cloud. The only requirement is a platform that supports Intel SGX enclaves. 

With Conclave, you can build applications that securely pool and process data from multiple parties. Conclave-powered solutions are so secure that no one sees the source data without permission—not even the cloud provider. You can see examples of applications you could build with Conclave, as well as tutorials, on our docsite.

Want to learn more about Conclave and Confidential Computing?

June 29, 2021

Conclave 1.1: Writing Confidential Computing Apps Just Got a Whole Lot Easier

We’re excited to announce the release of Conclave 1.1, the latest version of our platform that allows fast, simple development of Confidential Computing applications in Java, Kotlin or JavaScript. With this release we have really focused on developer experience, ensuring that we have a good range of sample projects covering a diverse set of applications including machine learning, as well as making big improvements to testing and debugging workflows through use of our improved ‘mock’ mode.

Furthermore, Conclave 1.1 has been tested on the latest 3rd Gen Intel Xeon Scalable processors, allowing for up to 512GB of encrypted memory per CPU.

Before we look at the new features, let’s recap some of the details of Confidential Computing and Conclave.

What problems are we solving?

When it comes to hosting a solution in the cloud, who do your customers trust with their sensitive data? Do they trust you as a software vendor? Do they trust the cloud service provider hosting the application? Do they need to trust that other customers will not get access to their data, or gain a competitive advantage through unauthorized use of that data?

Well, the good news is that with Confidential Computing, they do not need to worry about anyone using their data for any purpose other than what they approve and authorize.

The way this is achieved is via the use of a hardware-based Trusted Execution Environment (TEE) such as Intel SGX that isolates the code and memory used to process confidential data from the rest of the application. The software vendor can cryptographically prove that when a customer sends data to the application it can only be accessed inside an up-to-date secure TEE executing code that has been approved by the customer themselves.

Now, this sounds simple but in reality there are a lot of things to consider. How does the customer know the data is being processed by a real hardware TEE? How do they know what code is running inside the TEE? What happens if a vulnerability is found in the TEE implementation? How does the software vendor separate the business logic from the sensitive data processing?

There are lots of Confidential Computing domain-specific concepts to understand when solving these problems. Does that mean you need to be an expert in these concepts before you can implement and deploy your Confidential Computing application?

No! Not when using Conclave!

Conclave hides all of this complexity from the developer. You just develop a Java application as normal, making sure to keep your data contained within the ‘enclave’ part of your application. Conclave makes it really easy for the software vendor and the vendor’s customers to check that the enclave is running in an Intel SGX TEE and that the code running inside the enclave is exactly as expected.

As soon as the data leaves the customer environment it is encrypted in transit, at rest and most importantly, in use.

Let’s take a look at those new features in Conclave 1.1 then. We have made a number of improvements that make it easier for developers to get started, develop and test their Conclave applications.

Mock Mode

Firstly, we’ve completely redesigned the way ‘mock’ enclaves work. Conclave 1.0 included a special way of building your enclaves named ‘mock mode.’ This special mode allows you to build and run your enclaves without ever leaving a Java environment. However, with Conclave 1.0 you needed to write code specifically to take advantage of ‘mock mode.’

With Conclave 1.1, ‘mock mode’ has now been fully integrated into the SDK, meaning you can switch between your production build and your mock build with a simple build parameter.

One of the challenges when working with Intel SGX is in testing for scenarios that relate to the state of SGX itself. If a vulnerability is found within SGX, Intel quickly sends down an update to the Trusted Execution Environment. This causes all the encryption keys used by enclaves to be rotated, meaning that any secrets encrypted with the latest version of SGX cannot be read by the potentially vulnerable older version.

But how can a Conclave user test that this is indeed the case?

Well, with the new mock mode, Conclave makes it really easy to simulate changes to the SGX environment including version upgrades and downgrades, allowing tests to be written that check everything works correctly when this happens in a real SGX environment.

R3’s Sneha Damle has written a great blog on the latest mock features in Conclave 1.1.

Samples

Check out our new samples repository! Here you’ll find some great new samples including:

    • A sample showing how to use the Tribuo Java machine learning library in a Conclave enclave providing tools for classification, regression, clustering and model development
    • A sample Event Manager which gives a demonstration of how to host multi-party computations

In addition, you’ll find the CorDapp sample that is bundled with the Conclave SDK has been revamped to show how to integrate Corda network identities with Conclave.

Documentation

We’ve made loads of improvements to the Conclave documentation, ensuring it is accurate, easy to follow and generally really useful. The API documentation has also been given a facelift.

What else?

In addition to all the above, we’ve made loads of small improvements and fixes to make the developer experience better than ever. We hope you’ll agree that our hard work really does make it easy for you to write Confidential Computing applications.

Why don’t you try it out for yourself today?

Download Conclave 1.1 and see just how easy we have made it for you!

June 29, 2021

To Mock Or Not To Mock An Enclave

It is an exciting time: Conclave 1.1 is out and loaded with a bunch of new features and enhancements. In this blog I am going to talk about when you should compile and load an enclave in mock mode and how to do it, thus simplifying the development and testing when writing enclaves. Let’s start by downloading the SDK and the hello-world sample included within it.

Now, let’s talk about what an enclave is and how we use Conclave to build enclave applications. Then, we can explore the different modes you can use when building and running an enclave using Conclave. Finally, we’ll talk about ‘mock’ enclaves.

What is an enclave?

An Intel SGX enclave is a protected region of memory where the code executed and the data accessed are isolated in terms of:

    • Confidentiality: no one has access to the data
    • Integrity: no one can change the code and its behavior

No one can access this memory: not processes running at higher privilege levels, not the operating system, and not even the user themselves.

What is the role of Conclave when writing an SGX enclave?

Conclave builds on top of SGX, making it easier to develop enclaves in high-level languages, like Java or Kotlin.

An enclave is built and loaded as a shared object on Linux. When using Conclave, this is managed automatically for you. You develop your application in two parts: the ‘host’ and the ‘enclave.’

The host runs outside the enclave and is responsible for loading the enclave and connecting it with the outside world. The enclave is where you put your code that handles data, keeping it locked away and safe in a small, controlled environment.

The command shown below is an example of how to load and start a Conclave application on a Linux machine:

./gradlew host:run

But what if you’re using a Mac or Windows platform?

How to load enclaves on Mac or Windows

Conclave provides you with a container-gradle plugin, which essentially provides a Linux environment to build your enclave using a docker container. You can compile, load, start, and then run a host and enclave using the container-gradle plugin command.

./container-gradle host:run

When building an enclave for Intel SGX, Conclave compiles the code down to a native binary using the native-image capability in GraalVM. The resulting enclave benefits from a faster startup time and lower runtime memory overhead than if we had embedded an alternative JVM inside the enclave.

You might be wondering how to run an enclave if you don’t have an SGX-enabled CPU. This takes us to our next point.

Running an enclave in simulation mode

Conclave provides a ‘simulation’ mode for running enclaves on systems that don’t have an Intel SGX capability. Building and running Conclave applications in simulation mode uses the same process as production enclaves, except that you can run your application on a system that does not support SGX. However, simulation mode still needs to run on a Linux platform.

Again, the container-gradle script comes to our rescue on non-Linux platforms, allowing us to load our simulation enclave using a Linux docker container. If you build the ‘hello-world’ sample provided with the Conclave SDK, then by default when you run the container-gradle script, Conclave loads the enclave in simulation mode.

This mode still compiles your code down to a native binary, which can result in fairly lengthy build times. You might be wondering: is there another mode for developing your enclave that keeps build times as short as possible and makes debugging as easy as possible?

Running an enclave in modes other than simulation

Enclaves can also be built and run in debug and release modes. Both require an SGX-enabled CPU.

Debug mode lets you run your enclave code in a real SGX environment but with certain debugging features enabled. For instance, if you write to the console inside an enclave in debug mode, it will be shown in the console output. It is also possible to connect a debugger to a debug mode enclave, however, at present it is not easy to step through Java code using this mode. Although debug enclaves run in a real SGX environment, the debug capability opens the door to accessing data inside the enclave so debug mode enclaves should never be used in production!

Use release mode enclaves when you want to deploy your enclave to production. Release mode enclaves give you the full protection provided by Intel SGX. Unlike with a debug mode enclave, any writes to the console inside the enclave never leave the enclave, which prevents you from accidentally leaking data via the console.

To build an enclave in debug or release mode on non-Linux systems, specify the mode in the container-gradle command:

./container-gradle -PenclaveMode=debug host:build

For Linux systems, use the below command:

./gradlew -PenclaveMode=debug host:build

So far, we have covered building an enclave in release, debug and simulation modes.

Now what if you want to test your enclave logic quickly, without the need to convert the enclave code to a native image? This is where mock mode comes into the picture.

How to run an enclave in mock mode

With the Conclave 1.1 release you can now load your enclave in mock mode. This means the enclave is loaded in the same JVM as that of the host, but no native image is created, no SGX-enabled CPU is required, and the build time is also reduced drastically. In mock mode, calling an enclave function is similar to calling a plain Java function. This is useful when you are continuously changing the enclave logic in the development phase and want to quickly test it. To run your enclave in mock mode, use the same script and provide “mock” as a parameter.

Unlike when using simulation, debug or release modes, the below command will work on macOS and Windows, as well as Linux. You do not need to use the container-gradle script for running mock mode enclaves on non-Linux platforms.

./gradlew -PenclaveMode=mock host:run

Unit testing your Enclave Module using Mock Enclave

To test the enclave logic you can write unit tests in the enclave module. You need to add the host dependency to your enclave build.gradle, as the enclave instance is loaded by the EnclaveHost class present in the host module.

testImplementation "com.r3.conclave:conclave-host"

You can then create an instance of the enclave by calling EnclaveHost.load.

// Load the enclave by its fully qualified class name.
EnclaveHost mockHost = EnclaveHost.load("com.r3.conclave.sample.enclave.ReverseEnclave");
// Start it; no attestation parameters are needed for a mock enclave.
mockHost.start(null, null);

Conclave automatically loads the enclave in mock mode by internally checking whether the enclave class file is available on the classpath. You can obtain the enclave instance using the EnclaveHost.mockEnclave property and then access the mock enclave’s internals.

ReverseEnclave reverseEnclave = (ReverseEnclave)mockHost.getMockEnclave();

Once you have the enclave handle, you can unit test the business logic in the enclave just as you would test any simple Java function. All the business logic written in your enclave class should ideally be unit tested in the enclave module using the mock enclave mode.
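
For example, a minimal JUnit test for the hello-world sample’s ReverseEnclave might look like the sketch below. It assumes the sample enclave simply reverses the bytes it is sent; the class name comes from the SDK sample, but treat the details as illustrative.

import com.r3.conclave.host.EnclaveHost;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

class ReverseEnclaveTest {
    @Test
    void reversesInput() throws Exception {
        // With the enclave class on the classpath, load() gives us a mock enclave.
        EnclaveHost mockHost =
                EnclaveHost.load("com.r3.conclave.sample.enclave.ReverseEnclave");
        mockHost.start(null, null);
        // callEnclave routes the bytes into the enclave's local-call handler.
        byte[] response = mockHost.callEnclave("hello".getBytes());
        assertEquals("olleh", new String(response));
        mockHost.close();
    }
}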

If you run the below command, by default the enclave is loaded in mock mode.

./gradlew enclave:test

Unit testing your Host Module using Mock Enclave

When you want to test your enclave in simulation or on real SGX hardware (debug or release mode), it would make sense to write an integration test in the host module. Use the below command to run the host tests. By default, an enclave is loaded in simulation mode in the host.
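
./gradlew host:test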

You can now take advantage of the fact that an enclave can also be loaded in mock mode from the host test class as shown by the below command. Though I would say, when writing integration tests from the host, you would usually want to load your enclave in simulation mode or on real SGX hardware.

./gradlew -PenclaveMode=mock host:test

MockConfiguration

Sealing is the ability to encrypt and save data outside the enclave and make sure that only the appropriate enclave can decrypt it. When you build an enclave, the resulting enclave can be identified by two ‘measurements:’

    • MRENCLAVE is a hash of the code, static data and layout of the enclave binary and ensures the enclave cannot be modified before it has been loaded.
    • MRSIGNER is a hash of the public part of the key used to sign the enclave.

To encrypt the data, a key is used which is derived either from MRENCLAVE or MRSIGNER, and also version numbers for the software revocation level and the SGX TCB level. When data is sealed using MRENCLAVE, only that enclave can decrypt the data, hence updates to the enclave will leave the data useless. Sealing data to MRSIGNER allows different enclaves to access the data and also allows for enclave upgrades. Enclaves with higher revocation levels or TCB levels can read data sealed to lower levels, but enclaves with lower revocation or TCB levels cannot read data sealed by higher levels. You can test this behavior by passing in the different levels to a mock configuration as shown below.

// Simulate a specific platform state for the mock enclave.
MockConfiguration mockConfiguration = new MockConfiguration();
mockConfiguration.setRevocationLevel(1);   // software revocation level
mockConfiguration.setTcbLevel(2);          // platform (CPU/microcode) TCB level
mockConfiguration.setProductID(1);         // product ID of the enclave
EnclaveHost mockHost = EnclaveHost.load("com.r3.conclave.sample.enclave.ReverseEnclave", mockConfiguration);

Note:

Revocation level is the mock revocation level of the enclave; it is incremented when a bug in the enclave or Conclave code is fixed.

TCB level is the current security state of the CPU platform identified by Intel. This also includes the CPU SGX microcode version.

Want to learn more?

Below are some helpful resources to learn more about Conclave and Confidential Computing:

June 25, 2021

Confidential Computing: What It Is and Why It Matters for Insurance

It’s long been said that data is the new gold. It’s true that modern companies are data driven and data rich. In fact, it’s been reported that Facebook has over 50,000 data points on each of its 2.6 billion monthly active users, and Mastercard has over 13,000 data points on individual consumer behavior, global trade and every rung of commerce in between. “The data, and how we work with the data, is as important as the transactions themselves,” says Mastercard’s President of Operations and Technology, Ed McLaughlin.

Insurance is no different. Without data, insurance would not exist. From the very beginning of the modern insurance industry, data was captured, processed, and shared to enable risks to be understood, priced and transferred. Data remains at the core of insurance decision making, except now insurers have access to more data than ever before. In fact, entire business models have grown up around helping the insurance industry process data and extract insights, making sure insurers provide customers with relevant, affordable, and sustainable products, as well as manage their own business.

However, the challenges around protecting data are complex and onerous. Take, for example, GDPR. The UK’s Data Protection Act 2018 contains 354 pages of regulation concerning the processing of information relating to individuals. And the penalty for a GDPR violation? A maximum fine of £17.5 million or 4% of annual global turnover, whichever is greater!

Large and notable fines include:

    1. Google (€50m/£43.2m). Google was one of the first companies to be hit by a substantial GDPR fine of €50m in 2019.
    2. H&M (€35.3m/£32.1m)
    3. Tim – Telecom Italia (€27.8m/£24m)
    4. British Airways (£20m)
    5. Marriott International Hotels (£18.4m)

And, according to research from DLA Piper, across Europe in the 12 months leading up to January 27, 2021:

    • GDPR fines rose by nearly 40%
    • Penalties under the GDPR totalled €158.5 million ($191.5 million)
    • Data protection authorities recorded 121,165 data breach notifications (19% more than the previous 12-month period)

Beyond regulation, companies have internal governance and controls relating to data to ensure their own valuable intellectual property remains protected. Data is their lifeblood, used to make daily business decisions or drive key strategic initiatives to benefit customers, employees, and shareholders. In short, data really is gold.

So, how do companies ensure data is protected and will not be misused? The answer: “soft” policy controls.

Internally, companies deploy vast resources to ensure policies and procedures are developed and maintained to ensure everyone acts to keep data protected. The policy controls require all individuals in the organization to “follow the rules” with annual/periodic training and/or certification. Remember your last data protection training course?

These “soft” policy controls, often wrapped up in contractual terms and conditions, are also used when companies send data to 3rd party services that provide analytics.

These are so called “soft” policy controls as they rely on individuals following the rules. This is very difficult to validate, monitor and control, especially with independent 3rd party companies.

On a technical level, data can be kept secure when it’s stored using encryption (data at rest). And when the data is transmitted? That’s encrypted, too (data in transit). But how is data kept secure when it’s being “processed?” And how do you know data is being processed in the “right way?” This has until now been a significant weak point because even with the “soft” policy controls there has been no way of knowing if data is being used in the way the owner of that data intended, and no way to prove that the “processor” can’t see the data or won’t misuse it.

Take the example of filling out an online loan application. You enter the data on the webpage, then the data is sent to the loan provider in a secure encrypted form and stored in their database that’s also encrypted. But you have absolutely no way of knowing how your data is being processed, who it’s being shared with, or that it is secure when being processed. And what applies to your personal data when applying for a loan also applies to your company’s data when sharing it with 3rd parties, be it data analytics service providers, industry bodies, peers, or regulators.

However, new technology known as confidential computing is now available to solve this problem. By closing the loop and providing true end-to-end data privacy, confidential computing ensures that data is strictly isolated during processing and analyzed only in the agreed-upon way. The data being processed is invisible and unknowable to anything or anyone else, including the operator of the service and hardware. Cryptographic proof is provided confirming your data is being processed only as agreed.

This new way to process data is enabled by enhancements in hardware from leading manufacturers such as Intel. This new hardware technology, called Trusted Execution Environments (TEEs), brings in new levels of control that go far beyond the “soft” policy controls used historically. Instead, they use hardware controls that provide technical assurances that the data will be used only as intended. We will go into the technical details in a subsequent blog.

The end result? Individuals and companies can now process and share their precious and valuable data, safe in the knowledge that it is kept secure in transit, at rest, and now, in processing. Crucially, the data is only processed in the agreed way with very secure hardware assurances.

For insurance, the implications of this are numerous:

    1. Insurers can keep sensitive data in more secure environments and protect it from hosting providers or insider threats.
    2. Inbound: Insurers can prove with certainty that they are processing sensitive customer data only in the agreed way (i.e. used for the purposes of risk assessment, fraud detection, or claims payments).
    3. Outbound: Data can be handed over to a 3rd party for analysis confident in the fact that the 3rd party cannot see the raw data and that the data is analyzed in the agreed way.
    4. Data can be pooled with competitors for industry benchmarking or fraud purposes where no one can see the raw data.

In this blog post, we have introduced the new field of hardware controls to secure data during processing and how this can apply to insurance.

In our following blogs, we will discuss the technology behind these hardware controls, Trusted Execution Environments (TEEs) & remote attestation, along with specific insurance use cases and solutions.

Want to learn more?

Below are resources to learn more about Conclave and Confidential Computing:

June 22, 2021

How familiar are you with secure computing?

How familiar are you with secure computing? Have you used an enclave before? If you haven’t, then this is the right blog post for you.  Launching a secure service can be a daunting task. You’ve got to make a lot of decisions about system design, infrastructure, data storage policies, regulatory barriers…and of course, cost. It seems like there’s very little a software engineer can trust when it comes to developing technical infrastructure.

It seems like every month we’re made aware of new leaks and vulnerabilities in protocols and platforms like SSL. This isn’t just limited to specific organizations or application designs either. The Linux kernel has had over 60 vulnerabilities in 2021, and there’s been another VMware server vulnerability as I write this article. If you’re threat modeling for sophisticated actors or foreign government operations, it’s probably time to think about swapping your architecture for secure computing platforms. Depending on the circumstances, if an attacker can corrupt the operating system, or worse, the kernel, they can control the computer.

In a perfect industry, these technical issues would be the only ones to worry about—but they are only part of the problem. As developers, it’s easy to think about programs the way that a mathematician might. We have confidence in the software we write because we test it thoroughly. But as we saw above, data can be taken, protocols can be breached, and information can be leaked.

For example, in the securities market, many banks operate “dark pools,” essentially private exchanges for large market participants. One risk to those participants is that the hosts of some of these exchanges will front-run trades in violation of SEC rules. Even with a contract in place, the trust is with the institutions, and that trust is not always well-founded.  The truth is the hosts of certain kinds of data have incentives to misuse it. The computing world has only recently started to grapple with this problem, which is where tools like Conclave come in.

What is an enclave?

An enclave is a secure computing environment for the safe processing of data. An enclave can verify exactly what algorithm will process the data it’s given. Imagine an ideal CPU that only runs programs matching the exact hash of the software they were built from. In addition, the data going in and out of this enclave is encrypted. The host machine couldn’t read it even if it wanted to.

This special CPU enclave works just like a normal Intel processor with the same set of instructions. In this case it’s implemented by Intel’s Software Guard Extensions (SGX).

Having a secure system like this is great, but it’s not useful to your users if you still must be trusted to host and maintain the system. But what if you could also prove to your users that the code you are hosting does exactly what you claim it does? This is possible for enclaves that support the ability to verify the cryptographic hash of the entire sum of code running within the enclave. Conclave accomplishes this through remote attestation, enabling users to have complete confidence in the software they’re using.

What is the enclave guarding against?

The primary uses for an enclave are instances where the host operating system can’t be trusted, or where an entity maintaining a computer for a particular use case can’t be trusted.

Enclaves protect against both external attacks over the web and attacks from “trusted” places like the kernel, operating system and the hypervisor. The CPU is in theory the only place safe from compromise at every level of the computing stack, and there’s not much that users can do other than switch to software that puts security at the top of its list.

The concept of the enclave assumes that the entire computing environment in which it will be run is hostile. The design of the tool is such that it can’t be run unless it’s within a specific CPU-level enclave (such as Intel SGX).

Physical security and side-channel attacks

A side-channel attack is an attack that gains information from the physical implementation of a computer system instead of leveraging weaknesses in the software itself. It may not come as a surprise that physics plays a crucial role here. Common attacks are things like cache attacks, power monitoring, timing attacks, acoustic cryptanalysis and more.

One of the most fascinating examples of this I’ve come across is that theoretically, it’s possible to get into a secure enclave if you have the physical chip and try to read the memory with an electron microscope, but this approach would require serious effort and potentially spoil the data. Of course, the attacker must have the physical device. Ideally, when the chip is manufactured, it creates its own key and uses that for all of its encryption going forward.

We don’t think about this much on the software side, but when it comes to the physical device itself you may wonder how an enclave is secured. When data is stored in an enclave it’s encrypted in memory, and only the enclave itself holds the key needed to read it. What a lot of the hardware manufacturers do is design the chip so that it wipes any data on it once it’s physically tampered with. The idea is to store critical information in battery-backed static RAM so it spoils when tampered with or powered off.

You can see more information on this kind of hardware security on page 5 of this Microsemi design document. While I’m no expert, I’d imagine the approach of other semiconductor manufacturers is similar. If you’re curious to learn more about side channel attacks, I’d recommend reading the Conclave docs.

What is Conclave?

Conclave introduces a set of abstractions to enable a developer to interact with an enclave, along with a toolkit for development and emulation of secure computing applications that run within enclaves.

Conclave gives developers some basic abstractions within the architecture. To paraphrase the docs:

  • Clients exchange encrypted messages with enclaves by interacting with the host over the network.
  • Host programs just load enclaves. From a security perspective hosts are assumed to be malicious at all times.
  • Enclaves are classes that are loaded into a dedicated sub-JVM with a protected memory space, running inside the same operating system process as the host JVM. Because they don’t share heaps, the host may only exchange byte buffers with the enclave. Direct method calls also don’t work out of the box: that would require you to add some sort of RPC system on top. In this way it’s similar to interacting with a server over a network, except the enclave is fully local.

With all of this in mind, the architecture diagram below should start to come into focus:

[Image: Conclave architecture diagram]

Where do things go from here?

This is one of the first tools of its kind to be made available for developers. You can use Conclave to build out secure computing applications that have never existed before.

The classic example of this kind of problem is the “millionaires’ problem,” often tackled with zero-knowledge proofs or multi-party computation. Imagine we wanted to write a program that would tell us which of the two of us has more money. Assuming we didn’t want to reveal our numbers to each other, that would actually be quite tricky. One of the nice things about Conclave is that it can make solving this kind of problem quite trivial. The entire enclave is simply a machine that compares two numbers. The enclave can be run by either of us, but our numbers are never revealed to each other, only the answer.
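
To sketch how this might look with Conclave, the hypothetical enclave below compares two submitted numbers and reveals only the result. It uses the Enclave base class and the receiveFromUntrustedHost callback seen in the SDK samples; a production design would have each party deliver its value via encrypted Mail so the host never sees either number, so treat this as a simplified illustration.

import com.r3.conclave.enclave.Enclave;
import java.nio.ByteBuffer;

// Illustrative only: the whole "business logic" is one comparison.
public class RicherEnclave extends Enclave {
    @Override
    protected byte[] receiveFromUntrustedHost(byte[] bytes) {
        // For the sketch, both longs arrive in one buffer; a real design
        // would receive each value separately as encrypted Mail.
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long a = buf.getLong();  // party A's total
        long b = buf.getLong();  // party B's total
        String answer = (a == b) ? "equal" : (a > b ? "A" : "B");
        return answer.getBytes();  // only the verdict leaves the enclave
    }
}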

This is only an instructive example. You can actually write fully trustworthy client-server programs where you can be sure that the host of a product or service will only use your data in exactly the way you expect. For example, you can isolate secret and private keys from the host machine instead of storing them in the regular filesystem. Another popular example is private machine learning. Hospitals and other medical groups would be able to exchange medical data to build more robust statistical models without leaking any private or identifying patient information. You can also have provably private searches, such as searching a database where a query is only handled by the enclave, and even the operator of that machine wouldn’t be able to know what the query was.

The use cases really are fascinating, and we’ve only just scratched the surface of what’s possible. If this kind of computing sounds interesting to you, I highly recommend downloading Conclave and trying it for yourself.

Want to learn more?

Here are some additional resources you might find helpful:

June 18, 2021

Supply Chain Resilience using Confidential Computing

The pandemic introduced the global public to the very real problems of supply chain continuity. It also changed the way that supply chains approach resilience.

Together, these changes are leading to some fundamental shifts in the technologies being used for supply chain management. Of these, confidential computing is one that targets the single most important underlying problem in supply chain management: how to process business-sensitive data in use. By using confidential computing, it is now possible to protect sensitive data throughout all stages of its lifecycle, as well as provide technical assurances that it can’t be misused. This opens up several opportunities for the supply chain industry to build and take advantage of new collaborative solutions.

To make this concrete, we will go through two categories of use cases: aligning suppliers and managing products.

Resilience doesn’t come from working harder

Crisis is not new to supply chain management. Earthquakes, port strikes, road closures, cyberattacks and cranky customs officers are par for the course. Normally, these exceptions are local, and so are the solutions. In the beginning of the pandemic, this is exactly what happened. The explosion in job openings for supply chain professionals indicates that many companies immediately responded by working harder.

But as the pandemic has stretched into 2021, supply chains are looking at ways to work smarter.  Unfortunately, this isn’t always easy to do. The old and outdated systems currently being used aren’t designed to establish trust between counterparties or verify that an organization’s data will be protected. However, with the advent of confidential computing, it is now possible to build systems that aggregate data across multiple parties to build new solutions that align supply chain partners.

Below we outline several ideas on how this new collaboration can be implemented in highly innovative ways.

Use case I: Alignment With Supply Chain Partners

  • Cost reduction – Managing day-to-day operations in supplier relationships often necessitates a delicate balance between demands and incentives. Today’s supply chains experience high levels of inefficiency due to the inability to jointly manage cost. Suppliers are often unlikely to share cost data beyond contractual mandates simply because they fear price cuts. Likewise, related costs such as those incurred for transportation and logistics are often not reported, even when they could be optimized through collaboration.
  • Load tendering – The same applies when industry data is aggregated. Load tenders and cost per mile in transportation are instructive examples. Both can be obtained from a variety of sources to analyze not just who is over- or underpaid but also to identify key trends. It is further conceivable that we will witness the emergence of blockchain solutions that initiate transactions such as the procurement of transportation and warehousing services based on bids or auctions. This is a natural evolution and we already observe the emergence of load matching platforms and data aggregators.
  • Scorecarding – Monitoring supplier performance, or Scorecarding, is typically performed in ERP systems today, but it is hard to obtain data that allows for direct comparisons between different suppliers. If scorecards could be securely shared not just across all suppliers of a manufacturer but between several manufacturers, an entirely different picture of supplier performance would likely emerge. It would be possible to extract best-practices that all parties could directly benefit from.

Use case II: Managing Products and Demand

  • Secure data collection – An obvious example where collaboration is easy to achieve is the management of product features, observation of consumer behavior and collection of usage data. Confidential computing allows manufacturers to securely collect data about product usage in the field without intrusion of privacy or data leakage to improve the products they build. The resulting insights are incredibly valuable, especially when they are shared among all relevant parties in a supply chain.
  • Collaborative planning – Today, forecast accuracy typically ranges from 25% – 75%. Demand planning becomes substantially harder the farther we move away from the point of sale, as it is impossible for upstream suppliers to anticipate demand without downstream market knowledge. Leveraging confidential computing means everyone can pool data to derive accurate forecasts while remaining confident they aren’t giving away their competitive advantage.  This brings down inventory levels across the entire chain while also optimizing costs. The more parties that participate, the more benefits there are for everyone.
  • Inventory planning – Suppliers can also record planning, order, inventory, and production data securely on a blockchain and aggregate the data across multiple organizations on a secure confidential computing platform. All participants are assured confidentiality and anonymity in this way while they benefit from the results of analysis. The key to such a solution is that it must be trusted by all parties and guarantee that the underlying data is inaccessible to everyone, including malicious actors.

Can You Afford to Wait?

In 2021, we’re already referring to “the time before.” Solving local exceptions with brute force is part of that time. The emergence of confidential computing as a feasible technology for data sharing and processing means that supply chain management can evolve on a more fundamental level than ever before.

The use cases described here are just a peek at some of the changes that are happening as industry dynamics change with technology. The question may not be whether you can afford to get started today, but rather whether you can afford to wait any longer.

Want to learn more?

Here are some helpful resources to learn more about Confidential Computing and Conclave.

June 14, 2021

Remote Attestation: Your X-ray vision into an enclave

Confidential Computing can prove to be a game-changer in enabling multi-party computation without the risk of data leakage or tampering. It allows multiple enterprises to share confidential information and run various algorithms on it without any party being able to see the others' data.

If you are new to Confidential Computing or Conclave, consider taking a look at this article for a brief introduction.

Confidential Computing could lead to huge benefits in various fields. For instance, we can now develop better machine learning models thanks to the availability of bigger datasets, something that was previously not possible because of the risk of data being compromised when shared between organizations.

It all comes down to sharing your confidential data with an enclave, where it is processed and the result is returned. All well and good, but how do you know that the enclave in question is really authentic?
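
Before answering that question, here is what the round trip itself looks like from the client's side. This is a minimal, hedged sketch against Conclave's client API: EnclaveInstanceInfo, PostOffice, and EnclaveMail are Conclave classes, while the host address and the two transport helpers are hypothetical placeholders for whatever connection your application uses.

import com.r3.conclave.common.EnclaveInstanceInfo;
import com.r3.conclave.mail.EnclaveMail;
import com.r3.conclave.mail.PostOffice;

public class EnclaveRoundTrip {
    public static void main(String[] args) throws Exception {
        // Obtain the serialized attestation from the host. How the bytes travel
        // (TCP, HTTP, a message queue) is up to the application; this helper is
        // a hypothetical placeholder.
        byte[] attestationBytes = fetchAttestationFromHost("enclave-host:9999");
        EnclaveInstanceInfo attestation = EnclaveInstanceInfo.deserialize(attestationBytes);

        // Encrypt the request so that only this enclave can read it.
        PostOffice postOffice = attestation.createPostOffice();
        byte[] encryptedRequest = postOffice.encryptMail("my confidential data".getBytes());

        // The untrusted host relays the mail but cannot read or tamper with it.
        byte[] encryptedReply = sendAndReceive("enclave-host:9999", encryptedRequest);
        EnclaveMail reply = postOffice.decryptMail(encryptedReply);
        System.out.println(new String(reply.getBodyAsBytes()));
    }

    // Hypothetical transport helpers, to be implemented by the application.
    private static byte[] fetchAttestationFromHost(String address) { throw new UnsupportedOperationException(); }
    private static byte[] sendAndReceive(String address, byte[] mail) { throw new UnsupportedOperationException(); }
}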

Remote Attestation

Remote attestation is the piece of information that lets us verify the authenticity of an enclave. It is a special data structure that contains the following information:

  • Information indicating that a genuine Intel CPU is running
  • The public key of the enclave
  • A code hash called the measurement
  • Information indicating whether the computer is up-to-date and configured correctly

The most important piece of information that we are interested in here is the measurement. It is a hash covering the entire module, along with its dependencies, that is loaded into the enclave.

Every time a client wants to connect to an enclave and send it confidential information for processing, it must first obtain the enclave's remote attestation, which can be requested from the host, and verify the enclave's authenticity by comparing the measurement.

Below is an example of remote attestation received from the host for an enclave running in simulation mode:

Remote attestation for enclave DB2AF8DD327D18965D50932E08BE4CB663436162CB7641269A4E611FC0956C5F:
  - Mode: SIMULATION
  - Code signing key hash: 80A866679B567D6B27F5EF9044C13CCB057E761AB8400AD09CC8D70208579611
  - Public signing key: 302A300506032B657003210052C7DFDE99D81DF7FF05A2EBED5F8E25FC659A203FAFCA5B07B18CFFD3C5915E
  - Public encryption key: F3F02623B55E908C556CE17A13DF385BA621E5D5BCDCDEA8E92E30D4397E0404
  - Product ID: 1
  - Revocation level: 0
Assessed security level at 2021-05-10T10:09:08.107702Z is INSECURE
  - Enclave is running in simulation mode.
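
Given such an attestation object, the comparison itself can be a one-liner against an enclave constraint. The sketch below assumes Conclave's client library is on the classpath (package names follow Conclave 1.x and may vary between versions) and reuses the code signing key hash from the sample output above.

import com.r3.conclave.client.EnclaveConstraint;
import com.r3.conclave.client.InvalidEnclaveException;
import com.r3.conclave.common.EnclaveInstanceInfo;

public class AttestationCheck {
    static void verify(EnclaveInstanceInfo attestation) throws InvalidEnclaveException {
        // Accept only enclaves whose code signing key hash matches the one in the
        // output above, with product ID 1. SEC:INSECURE is tolerated here purely
        // because this sample runs in simulation mode; a production client would
        // insist on SEC:SECURE or SEC:STALE.
        EnclaveConstraint.parse(
            "S:80A866679B567D6B27F5EF9044C13CCB057E761AB8400AD09CC8D70208579611"
            + " PROD:1 SEC:INSECURE"
        ).check(attestation); // throws InvalidEnclaveException if not satisfied
    }
}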

Conclave was developed so that any two builds of the same source code always produce the same measurement. Developers can therefore either generate the measurement themselves or rely on a trusted third-party service provider to supply the measurement of the enclave.

Since any update to the source code changes the measurement, it is guaranteed that the enclave does exactly what it says it does.

A note on upgrades

It's pretty evident that any upgrade to the enclave code results in a new measurement. Verification would then fail, since the client would no longer recognize the enclave. A potential solution is to maintain a whitelist of acceptable hashes.

Alternatively, a signing key could be used: as long as the enclave is signed with that key, it can be deemed authentic. Both approaches are sketched below.
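
Both options map naturally onto Conclave's constraint language: a whitelist is expressed as C: (code hash) entries, while trusting a signing key is an S: (signer) entry plus a product ID. A hedged sketch of the two styles, reusing the hashes from the sample output above:

import com.r3.conclave.client.EnclaveConstraint;

public class UpgradeFriendlyConstraints {
    // Whitelist approach: list every acceptable measurement as a C: entry. The
    // hash below is the sample enclave's measurement from earlier; after an
    // upgrade you would append the new build's measurement as a second C: entry.
    static final EnclaveConstraint WHITELIST = EnclaveConstraint.parse(
        "C:DB2AF8DD327D18965D50932E08BE4CB663436162CB7641269A4E611FC0956C5F SEC:SECURE");

    // Signing-key approach: any enclave signed with this key, for product ID 1,
    // is trusted, so clients don't need a new constraint after each upgrade.
    static final EnclaveConstraint SIGNER = EnclaveConstraint.parse(
        "S:80A866679B567D6B27F5EF9044C13CCB057E761AB8400AD09CC8D70208579611 PROD:1 SEC:SECURE");
}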

Want to learn more?

Here are some helpful resources to learn more about Conclave and Confidential Computing.

June 03, 2021

A New Era of Privacy-Enhancing Technology Has Arrived

The next frontier for data privacy is fast approaching: according to analyst firm Gartner, by 2025 50% of large organizations will be looking to adopt privacy-enhancing computation (PEC) for processing data in untrusted environments and for multiparty data analytics. PEC is a cross-industry advance that will cause existing data privacy models and techniques to be radically disrupted, as it offers a new approach to protecting and sharing data across parties without actually revealing that data to anyone.

The appeal of data sharing is clear: sharing data across parties holds the key to unlocking greater analytics and insights, as well as identifying risks and detecting fraud. But if this is the case, why aren’t companies sharing data more freely? The answer is this: they are concerned about the data privacy and security risks that could come from doing so.

Fortunately, the solutions to these concerns are now at hand, with the introduction of Confidential Computing and other privacy-enhancing techniques that put firms in complete control over how their data will be used. To discuss the potential of these new privacy-enhancing technologies, R3’s Chief Technology Officer Richard Gendal Brown recently hosted a webinar where he was joined by two world-leading experts in the field: Michael Klein, Managing Director for Blockchain and Multiparty Systems Architecture at Accenture, and Paul O’Neill, Senior Director of Strategic Business Development at Intel.

Setting the scene, Richard mapped out the discussion in three stages: first, by scoping out the business problems around privacy that traditional technology can’t solve; second, by looking at some of the new technological approaches such as PEC that can solve these problems; and third, by examining how these technologies can actually be applied. According to Richard, “this isn’t a future-looking phenomenon. This is a collection of technologies that can be applied right now.”

So, what exactly is the business problem? Paul O’Neill of Intel said that looking across different industries – especially highly-regulated sectors such as healthcare and finance – the biggest challenge has been the rise of “incentivized collaboration.”

“Imagine you’re a hospital administrator, and you’re going to submit sensitive patient data and healthcare records to a research firm that’s going to perform a clinical trial with the patient’s consent,” Paul explained. “You desperately want to advance medical science. But as an enterprise, you’re worried. What happens if a rogue employee at the research firm steals that data? What if the research firm is using your patient’s data in a way that they didn’t agree to? To anybody involved in privacy, that’s really, really scary.”

What’s needed is a way for firms to know that their data remains protected at all times in a way that a third party cannot observe or even copy it – which is what technologies like Confidential Computing enable. However, these technologies are perceived as complex, and a recurrent theme during the debate was how to cut through this complexity to get to the core business issues. Accenture’s Michael Klein commented: “There are many techniques to encrypt data in use. Some are completely software-based, while some are hardware-based. And we can talk [to clients] about who they are actually trusting. Are they trusting the creator of the software or the creator of the hardware? And then, what are the features that the technique enables, and how ready is it to scale? I think those questions are probably the two biggest things that we encounter as we introduce our privacy-preserving functions or computations: helping our clients to understand that these are all valid techniques, and then choose the one that’s going to best fit their scenario and also scale to meet their needs.”

There isn't room in this short blog to go into the full richness of the debate. To experience it, click here to watch the webinar recording in full.

Want to learn more?

Here are some helpful resources to learn more about PEC, Confidential Computing and Conclave.

  • Hear from Gartner on why PEC is a top strategic technology for 2021 in a recent report.
  • Want to learn more about Conclave? Read Richard Brown’s recent blog post titled, “Introducing Conclave.”
  • Are you an app builder? Try a free 60-day trial of Conclave today.