Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Illegal read in do_is_ordered #236

Closed
TomAndrews opened this issue Apr 18, 2018 · 11 comments
Closed

Illegal read in do_is_ordered #236

TomAndrews opened this issue Apr 18, 2018 · 11 comments

Comments

@TomAndrews
Copy link
Contributor

Description

In the process of trying to track down the cause of #234 I ran some jobs through valgrind and it highlighted a jump based on an uninitialized value here:

https://github.com/joshuaulrich/xts/blob/master/src/isOrdered.c#L80

I don't think do_is_ordered checks that the length of x is at least 1 before checking whether the first value is NA.

I guess there could be a similar problem here:
https://github.com/joshuaulrich/xts/blob/master/src/isOrdered.c#L107
I think nx could be -1 so int_x[nx] may be an illegal read.

@joshuaulrich
Copy link
Owner

I'm not able to replicate this with a simple example. I assumed a zero-length integer() vector would trigger the behavior.

Can you provide more information about the specific isOrdered() call?

> R --quiet -d "valgrind" -e 'xts::isOrdered(integer())'
==3508== Memcheck, a memory error detector
==3508== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3508== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3508== Command: /usr/lib/R/bin/exec/R --quiet -e xts::isOrdered(integer())
==3508==
R> xts::isOrdered(integer())
[1] TRUE
R>
R>
==3508==
==3508== HEAP SUMMARY:
==3508==     in use at exit: 40,814,488 bytes in 18,669 blocks
==3508==   total heap usage: 37,365 allocs, 18,696 frees, 73,276,345 bytes allocated
==3508==
==3508== LEAK SUMMARY:
==3508==    definitely lost: 0 bytes in 0 blocks
==3508==    indirectly lost: 0 bytes in 0 blocks
==3508==      possibly lost: 0 bytes in 0 blocks
==3508==    still reachable: 40,814,488 bytes in 18,669 blocks
==3508==         suppressed: 0 bytes in 0 blocks
==3508== Rerun with --leak-check=full to see details of leaked memory
==3508==
==3508== For counts of detected and suppressed errors, rerun with: -v
==3508== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@TomAndrews
Copy link
Contributor Author

I foolishly didn't save the valgrind output when my attempted fix removed the valgrind warning but didn't fix the segfault.

I get the same as you for the simple integer() case. I'll try to replicate what I did once I've fixed the other two issues

@TomAndrews
Copy link
Contributor Author

I added some debug output here:
TomAndrews@27a817c

This verifys that we're in the integer branch and prints out the value of int_x[0]

Running this I get:

R --quiet -d "valgrind" -e 'xts::isOrdered(integer())'
==32437== Memcheck, a memory error detector
==32437== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32437== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==32437== Command: /usr/lib/R/bin/exec/R --no-restore-data --no-save --quiet --quiet -e xts::isOrdered(integer())
==32437==
> xts::isOrdered(integer())
isordered
int
169023288[1] TRUE
>
>
==32437==
==32437== HEAP SUMMARY:
==32437==     in use at exit: 40,837,407 bytes in 18,699 blocks
==32437==   total heap usage: 37,272 allocs, 18,573 frees, 73,232,454 bytes allocated
==32437==
==32437== LEAK SUMMARY:
==32437==    definitely lost: 0 bytes in 0 blocks
==32437==    indirectly lost: 0 bytes in 0 blocks
==32437==      possibly lost: 0 bytes in 0 blocks
==32437==    still reachable: 40,837,407 bytes in 18,699 blocks
==32437==                       of which reachable via heuristic:
==32437==                         newarray           : 4,264 bytes in 1 blocks
==32437==         suppressed: 0 bytes in 0 blocks
==32437== Rerun with --leak-check=full to see details of leaked memory
==32437==
==32437== For counts of detected and suppressed errors, rerun with: -v
==32437== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

It looks like it's printing an uninitialised integer but I'm not sure why valgrind isn't complaining.

@TomAndrews
Copy link
Contributor Author

I went back to the original case that caused problems and managed to replicate the problem:

==38154== Conditional jump or move depends on uninitialised value(s)
==38154==    at 0x14A67A7C: do_is_ordered (isOrdered.c:80)
==38154==    by 0x1D6905: do_dotcall (dotcode.c:1252)
==38154==    by 0x2128B3: Rf_eval (eval.c:728)
==38154==    by 0x2150CB: do_begin (eval.c:2192)
==38154==    by 0x2126D4: Rf_eval (eval.c:700)
==38154==    by 0x2144FE: R_execClosure (eval.c:1614)
==38154==    by 0x20BAA6: bcEval (eval.c:6444)
==38154==    by 0x212357: Rf_eval (eval.c:624)
==38154==    by 0x2144FE: R_execClosure (eval.c:1614)
==38154==    by 0x247877: dispatchMethod.isra.4 (objects.c:335)
==38154==    by 0x247C7C: Rf_usemethod (objects.c:375)
==38154==    by 0x217C2B: Rf_DispatchOrEval (eval.c:3219)
==38154==

I'll run it again with --track-origins=yes but it will take a while. I'm afraid I haven't worked out exactly which call to isOrdered is causing the problem.

@joshuaulrich
Copy link
Owner

Thanks for digging. I'll take a look too. It's not essential to know the cause in order to fix what does look like a bug, but it would be good to know how to trigger the issue in order to write a test.

@TomAndrews
Copy link
Contributor Author

Track origins is not particularly illuminating

==45262== Conditional jump or move depends on uninitialised value(s)
==45262==    at 0x14A67A7C: do_is_ordered (isOrdered.c:80)
==45262==    by 0x1D6905: do_dotcall (dotcode.c:1252)
==45262==    by 0x2128B3: Rf_eval (eval.c:728)
==45262==    by 0x2150CB: do_begin (eval.c:2192)
==45262==    by 0x2126D4: Rf_eval (eval.c:700)
==45262==    by 0x2144FE: R_execClosure (eval.c:1614)
==45262==    by 0x20BAA6: bcEval (eval.c:6444)
==45262==    by 0x212357: Rf_eval (eval.c:624)
==45262==    by 0x2144FE: R_execClosure (eval.c:1614)
==45262==    by 0x247877: dispatchMethod.isra.4 (objects.c:335)
==45262==    by 0x247C7C: Rf_usemethod (objects.c:375)
==45262==    by 0x217C2B: Rf_DispatchOrEval (eval.c:3219)
==45262==  Uninitialised value was created by a heap allocation
==45262==    at 0x4C2BBAF: malloc (vg_replace_malloc.c:299)
==45262==    by 0x242E9D: GetNewPage (memory.c:875)
==45262==    by 0x24327E: CONS_NR (memory.c:2324)
==45262==    by 0x179BC9: Rf_lcons (Rinlinedfuns.h:304)
==45262==    by 0x179BC9: do_lapply (apply.c:62)
==45262==    by 0x245D27: do_internal (names.c:1360)
==45262==    by 0x205DBB: bcEval (eval.c:6493)
==45262==    by 0x212357: Rf_eval (eval.c:624)
==45262==    by 0x2144FE: R_execClosure (eval.c:1614)
==45262==    by 0x20BAA6: bcEval (eval.c:6444)
==45262==    by 0x212357: Rf_eval (eval.c:624)
==45262==    by 0x212B62: forcePromise (eval.c:520)
==45262==    by 0x212FFF: FORCE_PROMISE (eval.c:4742)
==45262==    by 0x212FFF: getvar (eval.c:4784)
==45262==

@joshuaulrich
Copy link
Owner

The fix for this should be careful to consider the behavior documented in #221. Any potential changes to user-facing code need to be evaluated carefully.

@joshuaulrich
Copy link
Owner

Here is a short C function that demonstrates this odd behavior. Also note that it only seems to manifest for me if run via R, but not via Rscript.

#include <R.h>
#include <Rdefines.h>
#include <R_ext/Error.h>

/* Seem to get random memory if called from R, but not from Rscript. Compare:
  R CMD SHLIB weird.c && Rscript -e 'dyn.load("weird.so"); .Call("weird")'
  R CMD SHLIB weird.c && R --quiet -e 'dyn.load("weird.so"); .Call("weird")'
*/
SEXP weird(void)
{
  SEXP watint = allocVector(INTSXP, 0);
  Rprintf("%d\n", INTEGER(watint)[0]);
  return R_NilValue;
}

SEXP real_weird(void)
{
  SEXP watint = allocVector(REALSXP, 0);
  double wat = REAL(watint)[0];
  Rprintf("%f\n", wat);
  return R_NilValue;
}

Assuming the above function is in weird.c, the output is:

> R CMD SHLIB weird.c
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG      -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c weird.c -o weird.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o weird.so weird.o -L/usr/lib/R/lib -lR
> Rscript -e 'dyn.load("weird.so"); .Call("weird"); .Call("real_weird")'
0
NULL
0.000000
NULL
> R --quiet -e 'dyn.load("weird.so"); .Call("weird"); .Call("real_weird")'
R> dyn.load("weird.so"); .Call("weird"); .Call("real_weird")
33817848
NULL
0.000000
NULL
R>
R>

@TomAndrews
Copy link
Contributor Author

My first thought was that the code should check for the case where the length is zero and just return true in that case.

I think that's what will happen almost surely, unless the uninitialised value is NA_INTEGER, in which case it will return false.

Having checked that there is at least one value then accessing x_int[0] should be safe and you prevent any possible segfault (though I don't think I've ever seen a segfault from this)

@joshuaulrich
Copy link
Owner

@TomAndrews I spent way too much time thinking about this and messing with various attempts at elaborate tests, etc. I think the best solution for now is what you suggested, "check for the case where the length is zero and just return true".

Something like this before the if/else branch on TYPEOF(x):

if(LENGTH(x) < 1) {
  return ScalarLogical(1);
}

I would prefer you make the change and submit a PR, since you found this and provided the solution. I would like you to get the credit. Or I can make the change; let me know your preference.

TomAndrews added a commit to TomAndrews/xts that referenced this issue Aug 6, 2018
do_is_ordered tries to check whether the first element of the series
is NA.  It needs to first check whether the series has at least one
element to avoid reading an uninitialised value.

Fixes joshuaulrich#236.
@TomAndrews
Copy link
Contributor Author

Thanks! I've made a pull request here #261

@joshuaulrich joshuaulrich added this to the Release 0.11-1 milestone Aug 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants