This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

support PMEM inside Kata Containers #303

Closed
@pohly

Description

Applications running inside Kata Containers cannot use PMEM in App Direct mode because they don't get access to the original filesystem.

One idea for addressing this is to:

  • map the entire partition into memory
  • start QEMU such that it makes that memory range available inside the virtual machine
  • mount that memory range inside the virtual machine

Details to be decided, and mostly has to be handled in Kata Containers...
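
The three steps above roughly correspond to QEMU's emulated NVDIMM support. The following is a minimal sketch, assuming the `-machine ...,nvdimm=on`, `memory-backend-file`, and `nvdimm` device options described in QEMU's NVDIMM documentation. The function name, the sizes, and the backing path are hypothetical, and this is not necessarily how Kata Containers ends up invoking QEMU:

```python
def qemu_nvdimm_args(backing_path, nvdimm_size, ram_size="2G", max_mem="8G", slots=2):
    """Build QEMU arguments that expose a host file or device as a guest NVDIMM.

    All default values are illustrative placeholders, not recommendations.
    """
    return [
        # Enable NVDIMM support on the machine type.
        "-machine", "pc,nvdimm=on",
        # maxmem has to cover guest RAM plus all memory devices.
        "-m", f"{ram_size},slots={slots},maxmem={max_mem}",
        # Back the NVDIMM with the host file or device, shared so writes reach it.
        "-object",
        f"memory-backend-file,id=mem1,share=on,mem-path={backing_path},size={nvdimm_size}",
        # Attach the backend to the guest as an NVDIMM device.
        "-device", "nvdimm,id=nvdimm1,memdev=mem1",
    ]


if __name__ == "__main__":
    print(" ".join(qemu_nvdimm_args("/dev/pmem0", "4G")))
```

The guest kernel would then need to mount the resulting device itself, which is the part that has to be handled in Kata Containers.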

Activity

pohly commented on Jun 6, 2019

@pohly
Contributor · Author

To reproduce the problem inside our QEMU virtual cluster, nested virtualization is needed. We also need changes to install and use Kata Containers. I have all of that in a branch:
https://github.com/pohly/pmem-CSI/commits/nested-virtualization

kiendinh commented on Sep 26, 2019

ClearLinux version 31090 now ships kernel 5.3.1, and I see that VIRTIO_PMEM is available there. Can we expect to see PMEM-CSI on Kata soon?

pohly commented on Oct 11, 2019

My next step will be to try out Kata with virtiofs support, which will be released shortly. This might support volume-passthrough with fsdax.

pohly commented on Nov 25, 2019

kata-containers >= 1.9.0 has support for virtiofs builtin (https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-virtio-fs-with-kata.md) when using the kata-qemu-virtiofs runtime class (https://raw.githubusercontent.com/kata-containers/packaging/master/kata-deploy/k8s-1.14/kata-qemu-virtiofs-runtimeClass.yaml):

diff --git a/deploy/common/pmem-app-ephemeral.yaml b/deploy/common/pmem-app-ephemeral.yaml
index aca6bda6..28bf9220 100644
--- a/deploy/common/pmem-app-ephemeral.yaml
+++ b/deploy/common/pmem-app-ephemeral.yaml
@@ -5,6 +5,7 @@ apiVersion: v1
 metadata:
   name: my-csi-app-inline-volume
 spec:
+  runtimeClassName: kata-qemu-virtiofs
   containers:
     - name: my-frontend
       image: busybox

However, virtio-fs turned out not to be suitable for PMEM:

  • It does not map all pages at once. Instead, it maintains a cache of mapped pages which is considerably smaller ("a few GB") than the available PMEM. This should lead to lower performance.
  • Because a page might not be currently mapped when written to, it does not meet MAP_SYNC requirements.
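
To check whether a given mount actually honors MAP_SYNC, one can attempt such a mapping and see whether the kernel rejects it. This is a minimal sketch: the flag values are the Linux ones from `<linux/mman.h>` and are defined by hand here on the assumption that Python's `mmap` module does not export them, and the function name is hypothetical:

```python
import mmap
import os

# Linux-specific values from <linux/mman.h>, defined manually (assumption:
# the running Python version does not provide them in the mmap module).
MAP_SHARED_VALIDATE = 0x03
MAP_SYNC = 0x80000


def supports_map_sync(path, length=mmap.PAGESIZE):
    """Return True if `path` can be mapped with MAP_SYNC, False otherwise."""
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            mapping = mmap.mmap(
                fd,
                length,
                # MAP_SHARED_VALIDATE makes the kernel reject unknown or
                # unsupported flags instead of silently ignoring them.
                flags=MAP_SHARED_VALIDATE | MAP_SYNC,
                prot=mmap.PROT_READ | mmap.PROT_WRITE,
            )
        except (OSError, ValueError):
            # The kernel refused the mapping, or the file is too small to map.
            return False
        mapping.close()
        return True
    finally:
        os.close(fd)
```

Calling `supports_map_sync()` on a file inside the mounted volume is expected to return `False` on anything that is not a DAX-capable filesystem.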

I have engaged with the Kata Container folks here: kata-containers/runtime#2262

self-assigned this on Dec 10, 2019

pohly commented on Jan 14, 2020

There is a functional PoC in #500; now we need the corresponding changes in Kata Containers.

pohly commented on May 8, 2020

Kata Containers will have support in 1.11.0 (currently available as -rc0). PR #500 contains an E2E test with Kata Containers, but it's still WIP and doesn't pass.

One problem is that, by default, Kata Containers only allows VMs to have as much memory as the host has DRAM. If the host then wants to add a much larger PMEM volume, Kata Containers fails with something like:

Error: container create failed: QMP command failed: not enough space, currently 0x8000000 in use of total space for memory devices 0x3c100000
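
The two hexadecimal values in that message are byte counts. A small sketch for turning them into readable sizes (the function name is hypothetical, and the regular expression only assumes the wording of the message quoted above):

```python
import re

MIB = 1024 * 1024


def memory_device_usage(message):
    """Extract (in_use, total) in bytes from a QMP 'not enough space' error."""
    match = re.search(
        r"currently (0x[0-9a-fA-F]+) in use of total space "
        r"for memory devices (0x[0-9a-fA-F]+)",
        message,
    )
    if match is None:
        raise ValueError("unrecognized error message")
    in_use, total = (int(value, 16) for value in match.groups())
    return in_use, total


error = (
    "Error: container create failed: QMP command failed: not enough space, "
    "currently 0x8000000 in use of total space for memory devices 0x3c100000"
)
in_use, total = memory_device_usage(error)
print(
    f"in use: {in_use // MIB} MiB, total: {total // MIB} MiB, "
    f"free: {(total - in_use) // MIB} MiB"
)
# prints "in use: 128 MiB, total: 961 MiB, free: 833 MiB"
```

In this example only about 833 MiB remain for additional memory devices, so a larger PMEM volume cannot be added.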

One solution is to edit /opt/kata/share/defaults/kata-containers/configuration-qemu.toml after kata-deploy created it and increase memory_offset: https://github.com/kata-containers/runtime/blob/master/cli/config/configuration-qemu.toml.in#L91
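
Such an edit could be scripted along the following lines. This is only a sketch: it assumes the file contains a single `memory_offset = ...` entry that may be commented out with a leading `#`, the sample text and the value are placeholders, and the unit expected by Kata Containers should be taken from the comments in the configuration file itself:

```python
import re


def set_memory_offset(toml_text, value):
    """Return toml_text with memory_offset set to value, uncommenting it if needed."""
    # Match an active or commented-out entry, but not prose that merely
    # mentions memory_offset inside a longer comment.
    pattern = re.compile(r"^#?[ \t]*memory_offset[ \t]*=.*$", re.MULTILINE)
    new_text, count = pattern.subn(f"memory_offset = {value}", toml_text, count=1)
    if count == 0:
        raise ValueError("no memory_offset entry found")
    return new_text


if __name__ == "__main__":
    # Hypothetical excerpt, not the real file content.
    sample = "[hypervisor.qemu]\n# Default 0\n#memory_offset = 0\n"
    print(set_memory_offset(sample, 4096))
```

Applying it to the real file means reading `configuration-qemu.toml`, passing the text through `set_memory_offset()`, and writing the result back on each node.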

Alternatively, that limit can be raised individually for each pod:
https://github.com/kata-containers/documentation/blob/master/how-to/how-to-set-sandbox-config-kata.md
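
A per-pod override might look like the sketch below, which builds the pod manifest as a Python dictionary and prints it as JSON. The annotation key `io.katacontainers.config.hypervisor.memory_offset` is an assumption based on the naming scheme of the linked how-to and should be verified there. The `kata-qemu` runtime class and the value are placeholders too, and the pod and container names are simply reused from the example earlier in this thread:

```python
import json


def kata_pod_with_memory_offset(name, memory_offset, runtime_class="kata-qemu"):
    """Build a pod manifest that requests a larger memory_offset for its sandbox."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {
                # Assumed annotation key; Kubernetes annotation values must be strings.
                "io.katacontainers.config.hypervisor.memory_offset": str(memory_offset),
            },
        },
        "spec": {
            "runtimeClassName": runtime_class,
            "containers": [{"name": "my-frontend", "image": "busybox"}],
        },
    }


if __name__ == "__main__":
    pod = kata_pod_with_memory_offset("my-csi-app-inline-volume", 4096)
    print(json.dumps(pod, indent=2))
```

The printed JSON can be passed to `kubectl apply -f -`, since Kubernetes accepts JSON manifests as well as YAML.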

pohly commented on May 18, 2020

Should work now in "devel" (PR #500), but not tested in CI. Need to test once manually, then close this issue.

added the label "0.7" (needs to be fixed in 0.7.x) on May 18, 2020

pohly commented on Jun 10, 2020

Manual testing found a regression which then was fixed. Works now.

    support PMEM inside Kata Containers · Issue #303 · intel/pmem-csi