Proof with `AnySlice` does not terminate, memory grows unbounded #2634

arctic-alpaca · 2023-07-28T19:41:34Z

Hi,
I'm using kani to verify the absence of panics in some network parsing/manipulation code (without unsafe).
With some manipulation functions, using AnySlice leads to Kani not terminating and CBMC's memory usage growing until it's killed to prevent OOM.
Depending on the function, the memory grows at different speeds but the result is the same. Kani/CBMC does not terminate.

Switching from AnySlice to an array with kani::any() works without issue but does not verify the same properties.

The reproduction code contains three proofs, one using kani::any() and two using AnySlice. The working proof with AnySlice shows what I initially did to verify the code does not panic. To also verify that the manipulating method produces a valid state, the buffer is parsed (new()) again. Adding this second parsing triggered the non-terminating behavior. I did encounter this behavior before in other situations but attributed it to being too complex to use AnySlice so I reduced the scope of the proof. With this, reducing the scope is not possible without losing some property of the proof.

Different verifiers or using loop unwinding lead to the same outcome.

The reproduction is not really minimal but I hope this is alright. The memory usage in this example increases very slowly (over multiple hours it slowly used up over 16GiB) but it was the easiest to extract. If helpful, I can link my whole code with a proof where it takes less than 5 min to reach OOM kill.

pub struct NewDataBufferMut<B>
where
    B: AsRef<[u8]> + AsMut<[u8]>,
{
    pub(crate) buffer: B,
    data_start_offset: usize,
    header_start_offset: usize,
    header_length: usize,
}

impl<B: AsRef<[u8]> + AsMut<[u8]>> NewDataBufferMut<B> {
    pub fn new(buf: B, data_start_offset: usize) -> Result<NewDataBufferMut<B>, ()> {
        let data_length = buf.as_ref().len().saturating_sub(data_start_offset);

        if data_length < 3 {
            return Err(());
        }

        let mut result = Self {
            buffer: buf,
            data_start_offset,
            header_start_offset: data_start_offset,
            header_length: data_length,
        };

        let two_control_bytes = result.ieee_802_2_has_two_control_bytes() as usize;

        let header_length = if result.ieee_802_2_has_snap_extension() {
            8 + two_control_bytes
        } else {
            3 + two_control_bytes
        };

        if data_length >= header_length {
            result.header_length = header_length;
        } else {
            return Err(());
        }

        Ok(result)
    }

    #[inline]
    pub fn into_inner(self) -> B {
        self.buffer
    }

    #[inline]
    fn ieee_802_2_has_two_control_bytes(&self) -> bool {
        (self.data_buffer_starting_at_header()[2] & 0b0000_0011) != 0b0000_0011
    }

    #[inline]
    fn ieee_802_2_has_snap_extension(&self) -> bool {
        self.data_buffer_starting_at_header()[0..2] == [0xAA, 0xAA]
    }

    #[inline]
    pub fn cut_ieee_802_2_snap(&mut self, dsap: u8, ssap: u8) -> Result<(), ()> {
        if self.ieee_802_2_has_snap_extension() {
            if dsap == 0xAA && ssap == 0xAA {
                Err(())
            } else {
                self.data_buffer_starting_at_header_mut()[0..2].copy_from_slice(&[dsap, ssap]);

                let two_control_bytes = self.ieee_802_2_has_two_control_bytes() as usize;
                self.shrink_header(3 + two_control_bytes, 5);
                Ok(())
            }
        } else {
            Err(())
        }
    }

    #[inline]
    fn shrink_header(&mut self, current_header_bytes_to_move: usize, shrink_by: usize) {
        let start = self.data_start_offset();
        let end = self.header_start_offset() + current_header_bytes_to_move;
        let destination = self.data_start_offset() + shrink_by;
        *self.header_length_mut() -= shrink_by;
        *self.header_start_offset_mut() += shrink_by;
        *self.data_start_offset_mut() += shrink_by;
        self.buffer_mut().copy_within(start..end, destination);
    }

    #[inline]
    fn data_start_offset_mut(&mut self) -> &mut usize {
        &mut self.data_start_offset
    }

    #[inline]
    fn header_start_offset_mut(&mut self) -> &mut usize {
        &mut self.header_start_offset
    }

    #[inline]
    fn header_length_mut(&mut self) -> &mut usize {
        &mut self.header_length
    }

    #[inline]
    fn data_start_offset(&self) -> usize {
        self.data_start_offset
    }

    #[inline]
    fn header_start_offset(&self) -> usize {
        self.header_start_offset
    }

    #[inline]
    fn data_buffer_starting_at_header(&self) -> &[u8] {
        &self.buffer.as_ref()[self.header_start_offset()..]
    }
    #[inline]
    fn data_buffer_starting_at_header_mut(&mut self) -> &mut [u8] {
        let header_start_offset = self.header_start_offset();
        &mut self.buffer.as_mut()[header_start_offset..]
    }
    #[inline]
    fn buffer_mut(&mut self) -> &mut [u8] {
        &mut self.buffer.as_mut()[..]
    }
}

#[cfg(kani)]
mod verification {

    use super::*;

    const SLICE_LENGTH: usize = 60;
    const DATA_START_OFFSET: usize = SLICE_LENGTH + 10;

    #[kani::proof]
    fn works() {
        let mut data: [u8; 100] = kani::any();
        let data_start_offset = kani::any();
        kani::assume(data_start_offset < 110);

        if let Ok(mut to_test) = NewDataBufferMut::<_>::new(&mut data, data_start_offset) {
            let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
            NewDataBufferMut::<_>::new(to_test.into_inner(), data_start_offset).unwrap();
        }
    }

    #[kani::proof]
    fn also_works() {
        let mut slice: kani::slice::AnySlice<u8, SLICE_LENGTH> = kani::slice::any_slice();
        let data_start_offset = kani::any();
        kani::assume(data_start_offset < DATA_START_OFFSET);

        if let Ok(mut to_test) =
            NewDataBufferMut::<_>::new(slice.get_slice_mut(), data_start_offset)
        {
            let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
        }
    }

    #[kani::proof]
    fn does_not_work() {
        let mut slice: kani::slice::AnySlice<u8, SLICE_LENGTH> = kani::slice::any_slice();
        let data_start_offset = kani::any();
        kani::assume(data_start_offset < DATA_START_OFFSET);

        if let Ok(mut to_test) =
            NewDataBufferMut::<_>::new(slice.get_slice_mut(), data_start_offset)
        {
            let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
            // Parse data again to verify manipulated data is still valid.
            NewDataBufferMut::<_>::new(to_test.into_inner(), data_start_offset).unwrap();
        }
    }
}

The text was updated successfully, but these errors were encountered:

zhassan-aws · 2023-07-28T22:02:45Z

Thanks @arctic-alpaca for the bug report and the testcase. I'm indeed noticing a memory blowup with AnySlice even on simple testcases with smaller slice size bounds. For example, on the following test, memory usage exceeds 12 GB, and it runs for 30 minutes without terminating:

extern crate kani;
use kani::slice;

fn min(s: &[i32]) -> Option<i32> {
    s.iter().min().copied()
}

#[kani::proof]
#[kani::unwind(31)]
#[kani::solver(cadical)]
fn main() {
    let s: slice::AnySlice<i32, 30> = slice::any_slice();
    let res = min(&s);

    assert!(s.len() == 0 || res.is_some());
    if let Some(res) = res {
        let index: usize = kani::any_where(|i| *i < s.len());
        assert!(res <= s[index]);
    }
}

feliperodri · 2023-07-31T21:42:30Z

This performance issue might not be directly related to AnySlice. I modified @zhassan-aws example to use any_array() hoping for better performance, but I still get the same behavior.

extern crate kani;

fn min(s: &[i32]) -> Option<i32> {
    s.iter().min().copied()
}

#[kani::proof]
#[kani::unwind(31)]
#[kani::solver(cadical)]
fn main() {
    let s: [i32; 30] = kani::Arbitrary::any_array::<30>();
    let res = min(&s);

    assert!(s.len() == 0 || res.is_some());
    if let Some(res) = res {
        let index: usize = kani::any_where(|i| *i < s.len());
        assert!(res <= s[index]);
    }
}

Only adding the last assertion causes CBMC to get stuck in propositional reduction for a while and it triggers the performance bottleneck.

arctic-alpaca · 2023-08-01T08:27:54Z

Adding a proof with any_array() to my code in #2634 (comment) does not show the problematic behavior. Maybe there are different issues or different triggers for one issue?

#[kani::proof]
fn works1() {
    let mut data: [u8; 100] = kani::Arbitrary::any_array::<100>();
    let data_start_offset = kani::any();
    kani::assume(data_start_offset < 110);

    if let Ok(mut to_test) = NewDataBufferMut::<_>::new(&mut data, data_start_offset) {
        let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
        NewDataBufferMut::<_>::new(to_test.into_inner(), data_start_offset).unwrap();
    }
}

arctic-alpaca · 2023-09-12T09:42:12Z

I might have found two possible work-arounds for this issue while AnySlice does not terminate:

For loop over AnyArray:

#[kani::proof]
fn works_for_loop() {
  let data: [u8; SLICE_LENGTH] = kani::Arbitrary::any_array::<SLICE_LENGTH>();
  let data_start_offset = kani::any();
  kani::assume(data_start_offset < DATA_START_OFFSET);

  for i in 0..SLICE_LENGTH {
      let mut working_data = data.clone();
      if let Ok(mut to_test) =
          NewDataBufferMut::<_>::new(&mut working_data[..i], data_start_offset)
      {
          let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
          NewDataBufferMut::<_>::new(to_test.into_inner(), data_start_offset).unwrap();
      }
  }
}

This code should verify the same things AnySlice would verify as it iterates over all possible slice lenghts as parameter with an arbitrary array. Whether this is feasible or not is directly related to the used SLICE_LENGTH. In this example, 20 does terminate within a somewhat reasonable time (112s on my system), higher values less so.

any_vec() with as_slice() and as_mut_slice():

#[kani::proof]
fn works_vec() {
    let mut vec = kani::vec::any_vec::<u8, SLICE_LENGTH>();
    let mut slice = vec.as_mut_slice();
    let data_start_offset = kani::any();
    kani::assume(data_start_offset < DATA_START_OFFSET);

    if let Ok(mut to_test) = NewDataBufferMut::<_>::new(slice, data_start_offset) {
        let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
        NewDataBufferMut::<_>::new(to_test.into_inner(), data_start_offset).unwrap();
    }
}

This does terminate quite quickly (~1s on my system) and if I understand any_vec() correctly, this should verify the same properties as AnySlice.

I'd appreciate feedback whether the verified properties are actually equivalent to AnySlice or whether I'm overlooking something.

zhassan-aws · 2023-09-12T20:19:08Z

For loop over AnyArray:

If you want it to cover the case where the slice spans the whole array, the loop needs to include SLICE_LENGTH as well:

  for i in 0..=SLICE_LENGTH {

Otherwise, it looks equivalent. You may also replace the loop with a non-deterministic slice, e.g.:

  let data: [u8; SLICE_LENGTH] = kani::Arbitrary::any_array::<SLICE_LENGTH>();
  let end: usize = kani::any_where(|e| *e <= SLICE_LENGTH);
  let slice = data[..end];

Not sure how Kani would perform on this harness compared to the loop-based one. For the record, this is how the old implementation of AnySlice worked, but it was replaced with a dynamic allocation based implementation to avoid a potential unsoundness.

2. any_vec() with as_slice() and as_mut_slice():

This one should be equivalent.

arctic-alpaca · 2023-09-13T07:10:54Z

the loop needs to include SLICE_LENGTH as well

I missed that, thanks.

This one should be equivalent.

That's perfect! It terminates with much more complex proofs than AnySlice allowed for.

Out of curiosity, wouldn't it be possible to replace the AnySlice implementation with something similar to any_vec().as_slice() (after #2759 is fixed)?

zhassan-aws · 2023-09-13T21:38:21Z

That's perfect! It terminates with much more complex proofs than AnySlice allowed for.

With the caveat mentioned in #2759 (which you already noticed).

Out of curiosity, wouldn't it be possible to replace the AnySlice implementation with something similar to any_vec().as_slice() (after #2759 is fixed)?

That would make sense if it continues to have superior performance even after fixing #2759.

arctic-alpaca · 2023-09-20T19:53:27Z

With the release of 0.37.0, the works_vec() proof from #2634 (comment) does not seem to terminate anymore. The memory is growing seemingly unbounded. I'm guessing this is caused by #2765, but I cannot dig deeper into this atm. I originally noticed this because the verifications in https://github.com/arctic-alpaca/mutnet started causing OOM issues.

zhassan-aws · 2023-09-20T21:47:12Z

Indeed :(

If I revert the change to any_vec:

+++ b/library/kani/src/vec.rs
@@ -14,9 +14,10 @@ pub fn any_vec<T, const MAX_LENGTH: usize>() -> Vec<T>
         exact if exact == MAX_LENGTH => exact_vec::<T, MAX_LENGTH>(),
         _ => {
             let mut any_vec = exact_vec::<T, MAX_LENGTH>();
-            any_vec.truncate(real_length);
-            any_vec.shrink_to_fit();
-            assert!(any_vec.capacity() == any_vec.len());
+            //any_vec.truncate(real_length);
+            //any_vec.shrink_to_fit();
+            unsafe { any_vec.set_len(real_length) }
+            //assert!(any_vec.capacity() == any_vec.len());
             any_vec
         }
     }

the proof goes through quickly.

If you're not worried about the code making illegal accesses to the slice, you can explicitly create the slice, e.g:

   #[kani::proof]
    fn works_slices() {
        let mut arr: [u8; SLICE_LENGTH] = kani::any();
        let nd_index = kani::any_where(|i| *i <= SLICE_LENGTH);
        let slice = &mut arr[..nd_index];
        let data_start_offset = kani::any();
        kani::assume(data_start_offset < DATA_START_OFFSET);

        if let Ok(mut to_test) =
            NewDataBufferMut::<_>::new(slice, data_start_offset)
        {
            let _ = to_test.cut_ieee_802_2_snap(kani::any(), kani::any());
            NewDataBufferMut::<_>::new(to_test.into_inner(), data_start_offset).unwrap();
        }
    }

arctic-alpaca · 2023-09-24T16:37:37Z

If you're not worried about the code making illegal accesses to the slice, you can explicitly create the slice

Thanks, that's what I'm going to do. Luckily the unsoundness issues doesn't seem to affect safe code. Accessing an index out of slice bounds still fails the proof as expected.

Deprecate kani::slice::any_slice in Kani 0.38.0 (and the related AnySlice struct), and remove it in a later version. Motivation: The Kani library's `slice::any_slice` API is doing more harm than benefit: it is meant to provide a convenient way for users to generate non-deterministic slices, but its current implementation aims to avoid any unsoundness by making sure that for the non-deterministic slice returned by the API, accessing memory beyond the slice length is flagged as an out-of-bounds access (see #1009 for details). However, in practical scenarios, using it ends up causing memory to blowup for CBMC. Given that users may not care about this type of unsoundness, which is only possible through using a pointer dereference inside an unsafe block (see discussion in #2634). Thus, we should leave it to the user to decide the suitable method for generating slices that fits their verification needs. For example, they can use Kani's alternative API, `any_slice_of_array` that extracts a slice of a non-deterministic start and end from a given array.

zhassan-aws · 2023-09-26T06:10:43Z

Sounds good! We've deprecated AnySlice in #2789 to avoid such confusions in the future. Thanks for bringing up this issue!

arctic-alpaca added the [C] Bug This is a bug. Something isn't working. label Jul 28, 2023

zhassan-aws added [E] Performance Track performance improvement (Time / Memory / CPU) T-User Tag user issues / requests labels Jul 28, 2023

zhassan-aws added the T-High Priority Tag issues that have high priority label Sep 22, 2023

celinval assigned zhassan-aws Sep 22, 2023

This was referenced Sep 25, 2023

Deprecate kani::slice::any_slice #2788

Closed

Deprecate any_slice #2789

Merged

Performance is significantly worse with Vec::with_capacity #2790

Closed

zhassan-aws closed this as completed in #2789 Sep 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proof with `AnySlice` does not terminate, memory grows unbounded #2634

Proof with `AnySlice` does not terminate, memory grows unbounded #2634

arctic-alpaca commented Jul 28, 2023

zhassan-aws commented Jul 28, 2023

feliperodri commented Jul 31, 2023

arctic-alpaca commented Aug 1, 2023 •

edited

Loading

arctic-alpaca commented Sep 12, 2023

zhassan-aws commented Sep 12, 2023

arctic-alpaca commented Sep 13, 2023

zhassan-aws commented Sep 13, 2023

arctic-alpaca commented Sep 20, 2023

zhassan-aws commented Sep 20, 2023

arctic-alpaca commented Sep 24, 2023

zhassan-aws commented Sep 26, 2023 •

edited

Loading

Proof with AnySlice does not terminate, memory grows unbounded #2634

Proof with AnySlice does not terminate, memory grows unbounded #2634

Comments

arctic-alpaca commented Jul 28, 2023

zhassan-aws commented Jul 28, 2023

feliperodri commented Jul 31, 2023

arctic-alpaca commented Aug 1, 2023 • edited Loading

arctic-alpaca commented Sep 12, 2023

zhassan-aws commented Sep 12, 2023

arctic-alpaca commented Sep 13, 2023

zhassan-aws commented Sep 13, 2023

arctic-alpaca commented Sep 20, 2023

zhassan-aws commented Sep 20, 2023

arctic-alpaca commented Sep 24, 2023

zhassan-aws commented Sep 26, 2023 • edited Loading

Proof with `AnySlice` does not terminate, memory grows unbounded #2634

Proof with `AnySlice` does not terminate, memory grows unbounded #2634

arctic-alpaca commented Aug 1, 2023 •

edited

Loading

zhassan-aws commented Sep 26, 2023 •

edited

Loading