network: Extract setup and status update logic from the virt-handler package #6781
Conversation
/sig network
/retest
Force-pushed from e593696 to c3e189a
/test all
Force-pushed from c3e189a to 796b942
Force-pushed from 796b942 to de6e3d3
Force-pushed from de6e3d3 to 5284191
Force-pushed from eb0f2a8 to 1d5a0b1
/assign @maiqueb
The internal logic of the network setup is decoupled from the virt-handler unit tests by using a stub. The existing tests can now check that the network setup is invoked as before, but without needing to know how the network setup works internally. Signed-off-by: Edward Haas <edwardh@redhat.com>
Keep a test that checks the scenario where the network status update is called from the main flow. Signed-off-by: Edward Haas <edwardh@redhat.com>
For consistency, the (semi) reverse operation of the setup is now called teardown. The method rename is introduced with documentation to clarify that the teardown involves only the cache cleanup, and not the removal of the network entities (these are removed as part of the pod/container destruction). Signed-off-by: Edward Haas <edwardh@redhat.com>
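As a rough illustration of the stubbing approach described in the first commit message above, a minimal sketch follows; the interface shape, package, and all names here are assumptions for illustration, not the actual kubevirt code.

```go
// netConfStub stands in for the real network setup in virt-handler unit tests.
// The tests only assert that Setup/Teardown were invoked; they no longer need
// to know how the network setup works internally. All names are assumed.
package virthandler_test

import (
	v1 "kubevirt.io/api/core/v1" // import path assumed; older trees use kubevirt.io/client-go/api/v1
)

type netConfStub struct {
	setupCalled    bool
	teardownCalled bool
}

func (s *netConfStub) Setup(vmi *v1.VirtualMachineInstance, launcherPid int, doNetNS func(func() error) error, preSetup func() error) error {
	s.setupCalled = true
	return nil
}

func (s *netConfStub) Teardown(vmi *v1.VirtualMachineInstance, do func() error) error {
	s.teardownCalled = true
	return nil
}

func (s *netConfStub) SetupCompleted(vmi *v1.VirtualMachineInstance) bool {
	return s.setupCalled
}
```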
Force-pushed from dadff7e to 75b0bd4
Commit 1 review
}

func (c *Controller) Teardown(vmi *v1.VirtualMachineInstance, do func() error) error {
	c.setupCompleted.Delete(vmi.UID)
It seems safer to move the Delete call to after the do callback. Is there a special reason for the current order?
Once teardown is called, the cache should be cleared even if the operation partially fails. Otherwise, we will have leftovers.
But again, this is following the current logic.
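To make the ordering being discussed concrete, here is a hedged sketch (package, field names, and the error wrapping are assumed) in which the cache entry is deleted before the callback runs, so a partial failure cannot leave a stale entry behind:

```go
package netsetup

import (
	"fmt"
	"sync"

	v1 "kubevirt.io/api/core/v1" // import path assumed
)

type Controller struct {
	setupCompleted sync.Map // keyed by VMI UID
}

func (c *Controller) Teardown(vmi *v1.VirtualMachineInstance, do func() error) error {
	// Clear the cache entry first: once teardown is called, the entry must go
	// even if the wrapped operation partially fails, otherwise leftovers remain.
	c.setupCompleted.Delete(vmi.UID)

	if err := do(); err != nil {
		return fmt.Errorf("teardown failed, err: %w", err)
	}
	return nil
}
```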
		return fmt.Errorf("setup failed, err: %w", err)
	}

	c.setupCompleted.Store(id, struct{}{})
Since sync.Map is used, I understand the code here should be thread safe. But the suggested code is not thread safe: multiple threads can perform the setup simultaneously.
This is mimicking the existing code, nothing logically changed.
The thread-safety is for the cache itself, as multiple threads can access that "database". The setup itself is thread safe as only one VMI can be treated at a time, so there is nothing special to protect here.
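Continuing the Controller sketch above, the point can be illustrated as follows (the exact Setup signature and the body running through doNetNS are assumptions): sync.Map makes each individual cache access safe across goroutines, while the non-atomic check-then-Store is acceptable because a given VMI is handled by a single worker at a time.

```go
func (c *Controller) Setup(vmi *v1.VirtualMachineInstance, launcherPid int, doNetNS func(func() error) error, preSetup func() error) error {
	id := vmi.UID

	// Fast path: the setup for this VMI already completed earlier.
	if _, exists := c.setupCompleted.Load(id); exists {
		return nil
	}

	// Not atomic with the Load above, but only one goroutine processes a given
	// VMI at a time, so there is no race on this key; sync.Map only protects
	// concurrent access to the cache across different VMIs.
	setup := func() error {
		if err := preSetup(); err != nil {
			return err
		}
		// ... the actual interface configuration would run here ...
		return nil
	}
	if err := doNetNS(setup); err != nil {
		return fmt.Errorf("setup failed, err: %w", err)
	}

	c.setupCompleted.Store(id, struct{}{})
	return nil
}
```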
return neterrors.CreateCriticalNetworkError(fmt.Errorf("failed to set up vhost-net device, %s", err))
return d.networkController.Setup(vmi, isolationRes.Pid(), isolationRes.DoNetNS, func() error {
	if requiresDeviceClaim {
		if err := d.claimDeviceOwnership(rootMount, "vhost-net"); err != nil {
Why did you move rootMount := isolationRes.MountRoot() outside the if?
In order not to bind (via the closure) the whole isolation object to the function, just the needed data. I wanted to capture only the minimum needed.
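A fragment sketch of that point, using the variable names from the diff above (not standalone code, it sits inside the virt-handler function shown there): only rootMount is captured by the closure handed to Setup, rather than the whole isolation result.

```go
// Only the data the closure actually needs is captured, not the whole
// isolationRes object.
rootMount := isolationRes.MountRoot()

preSetup := func() error {
	if requiresDeviceClaim {
		if err := d.claimDeviceOwnership(rootMount, "vhost-net"); err != nil {
			return neterrors.CreateCriticalNetworkError(fmt.Errorf("failed to set up vhost-net device, %s", err))
		}
	}
	return nil
}

return d.networkController.Setup(vmi, isolationRes.Pid(), isolationRes.DoNetNS, preSetup)
```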
@@ -591,7 +597,6 @@ var _ = Describe("VirtualMachineInstance", func() {
 	Expect(mockQueue.GetRateLimitedEnqueueCount()).To(Equal(0))
 	_, err := os.Stat(mockWatchdog.File(vmi))
 	Expect(os.IsNotExist(err)).To(BeFalse())
-	Expect(controller.phase1NetworkSetupCache.Size()).To(Equal(1))
Why didn't you substitute it with Expect(controller.networkController.SetupCompleted(vmi)).To(BeTrue())?
I saw no reason why we need to check the state of the network cache in this test.
Do you see any reason for it?
@@ -664,7 +669,6 @@ var _ = Describe("VirtualMachineInstance", func() {
 	mockHotplugVolumeMounter.EXPECT().Mount(gomock.Any()).Return(nil)
 	controller.Execute()
 	testutils.ExpectEvent(recorder, VMIDefined)
-	Expect(controller.phase1NetworkSetupCache.Size()).To(Equal(1))
same
Same answer, I do not see why we need to check this in this test context. The creation of the domain should be checked, not whether the network setup was invoked.
There are dedicated tests for networking.
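If a dedicated networking test does want to assert that the setup ran, it could go through the controller API (SetupCompleted, as mentioned above) rather than the cache internals. A hypothetical sketch; launcherPid, doNetNS, and the surrounding fixtures are assumed:

```go
It("reports the network setup as completed after a successful setup", func() {
	Expect(controller.networkController.Setup(vmi, launcherPid, doNetNS, func() error { return nil })).To(Succeed())
	Expect(controller.networkController.SetupCompleted(vmi)).To(BeTrue())
})
```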
@@ -2582,8 +2585,7 @@ var _ = Describe("VirtualMachineInstance", func() {
 		PodIP:  podIPs[0],
 		PodIPs: podIPs,
 	}
-	err = controller.networkCacheStoreFactory.CacheForVMI(vmi).Write(interfaceName, podCacheInterface)
-	Expect(err).ToNot(HaveOccurred())
+	controller.networkController.CachePodInterfaceVolatileData(vmi, interfaceName, podCacheInterface)
The original test used the non-volatile cache. Do we have unit test coverage for it?
The non-volatile cache was used to inject data to see if the status update reports the right thing. I do not think the intention was to check this cache, but to check that the status update does its thing.
So replacing it with the volatile data is both faster and closer to the intention here.
Per what I saw, how the non-volatile cache is used is already covered by other tests, like the ones under https://github.com/kubevirt/kubevirt/blob/main/pkg/network/setup/network_test.go .
Looking at the reporting/status part, at the end of the refactoring some scenarios are checked against the volatile cache and some against the non-volatile one.
But I think this is a small part, as the interaction between the non-volatile and volatile caches is simple. Maybe it is worth adding a specific test that checks that the non-volatile cache is populated into the volatile cache on first usage.
I have added a test to cover the last point here.
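Roughly, the added test could look like the sketch below. The cache seeding follows the diff snippet earlier in this review; the status-update entry point (UpdateStatus here) and the fixture/helper names are assumptions, not the actual kubevirt API.

```go
It("populates the volatile cache from the non-volatile cache on first usage", func() {
	// Seed only the non-volatile (filesystem-backed) cache.
	podCacheInterface := newPodCacheInterface("1.2.3.4") // hypothetical helper
	err := cacheFactory.CacheForVMI(vmi).Write("default", podCacheInterface)
	Expect(err).ToNot(HaveOccurred())

	// The first status update finds no volatile data, so it should read the
	// non-volatile cache, report from it, and populate the volatile cache.
	Expect(netStat.UpdateStatus(vmi, domain)).To(Succeed())
	Expect(vmi.Status.Interfaces).NotTo(BeEmpty())
	Expect(vmi.Status.Interfaces[0].IP).To(Equal("1.2.3.4"))
})
```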
c.networkController = netsetup.NewController(c.networkCacheStoreFactory)
c.networkController = netsetup.NewController(netcache.NewInterfaceCacheFactory())
Why do you pass the cache factory as a parameter to NewController and not initialize it inside the controller?
Same reason it was done until now... it needs to be stubbed/mocked in the tests with one that does not use the filesystem. See here.
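To illustrate the answer, a sketch follows: the constructor shape comes from the diff above, while the interface shapes and the in-memory test variant are assumptions. Injecting the factory is what lets the tests swap in a cache that never touches the filesystem.

```go
// The seam the controller depends on; shapes assumed beyond what the diff shows.
type PodInterfaceCacheStore interface {
	Write(ifaceName string, data interface{}) error
}

type InterfaceCacheFactory interface {
	CacheForVMI(vmi *v1.VirtualMachineInstance) PodInterfaceCacheStore
}

// Revised Controller sketch with the injected factory.
type Controller struct {
	cacheFactory   InterfaceCacheFactory
	setupCompleted sync.Map
}

func NewController(cacheFactory InterfaceCacheFactory) *Controller {
	return &Controller{cacheFactory: cacheFactory}
}

// Production wiring (from the diff above):
//   c.networkController = netsetup.NewController(netcache.NewInterfaceCacheFactory())
// Test wiring (hypothetical in-memory factory, nothing touches the filesystem):
//   controller.networkController = netsetup.NewController(newInMemoryCacheFactory())
```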
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: AlonaKaplan
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest
/retest
/retest
/retest
/retest
/retest
/retest
/retest
/retest
What this PR does / why we need it:
Decouple network setup and status logic out of the virt-handler. The network setup is now handled by the NetConf object and the status updates by NetStat. Production code and tests moved under the network package, leaving only a slim check that both setup and status are actually called in the overall flow.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
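As a rough sketch of the split described above (interface and method shapes are assumptions drawn from the review discussion, not the exact kubevirt API), virt-handler ends up depending on two narrow collaborators:

```go
// Network setup/teardown, owned by NetConf.
type netconf interface {
	Setup(vmi *v1.VirtualMachineInstance, launcherPid int, doNetNS func(func() error) error, preSetup func() error) error
	Teardown(vmi *v1.VirtualMachineInstance, do func() error) error
	SetupCompleted(vmi *v1.VirtualMachineInstance) bool
}

// Status reporting, owned by NetStat (method name and domain type assumed).
type netstat interface {
	UpdateStatus(vmi *v1.VirtualMachineInstance, domain *api.Domain) error
}
```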
Special notes for your reviewer:
Follow-up PR/s should continue and:
- Simplify the NetConf/Setup signature by embedding DoNetNS into the setup itself.
- Extract the remaining network logic from vm.go (e.g. checkNetworkInterfacesForMigration).

Release note: