-
-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DTrace probes for all message push and pop operations #2295
Conversation
3dfd204
to
25b9e6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit wary regarding the additional parameters for message queues functions. It could result in a performance loss even in non-DTrace builds. The performance impact could be nullified by ifdef
ing the additional parameters based on the DTrace status.
I've also left some style comments.
src/libponyrt/actor/messageq.c
Outdated
} | ||
|
||
bool ponyint_actor_messageq_push(scheduler_t* sched, | ||
pony_actor_t* from_actor, pony_actor_t* to_actor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The style convention for multiline function parameters is two spaces at the start of the line.
src/libponyrt/actor/messageq.c
Outdated
|
||
while(m != last) | ||
{ | ||
DTRACE4(ACTOR_MSG_PUSH, index, m->id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation should be two spaces here.
src/libponyrt/actor/messageq.c
Outdated
} | ||
|
||
bool ponyint_thread_messageq_push(uintptr_t from_thr, uintptr_t to_thr, | ||
messageq_t* q, pony_msg_t* first, pony_msg_t* last) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two spaces here.
src/libponyrt/actor/messageq.c
Outdated
} | ||
|
||
bool ponyint_actor_messageq_push_single(scheduler_t* sched, | ||
pony_actor_t* from_actor, pony_actor_t* to_actor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two spaces here.
src/libponyrt/actor/messageq.c
Outdated
while(m != last) | ||
{ | ||
DTRACE4(ACTOR_MSG_PUSH, index, m->id, | ||
(uintptr_t)from_actor, (uintptr_t)to_actor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two spaces here.
src/libponyrt/asio/kqueue.c
Outdated
@@ -61,8 +62,11 @@ static void handle_queue(asio_backend_t* b) | |||
{ | |||
asio_msg_t* msg; | |||
|
|||
while((msg = (asio_msg_t*)ponyint_messageq_pop(&b->q)) != NULL) | |||
while((msg = (asio_msg_t*)ponyint_thread_messageq_pop( | |||
SPECIAL_THREADID_KQUEUE, &b->q)) != NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two spaces here.
src/libponyrt/asio/kqueue.c
Outdated
@@ -356,9 +360,11 @@ PONY_API void pony_asio_event_unsubscribe(asio_event_t* ev) | |||
|
|||
asio_msg_t* msg = (asio_msg_t*)pony_alloc_msg( | |||
POOL_INDEX(sizeof(asio_msg_t)), 0); | |||
msg->msg.id = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above for the ID assignment.
src/libponyrt/gc/cycle.c
Outdated
@@ -681,7 +681,8 @@ static void final(pony_ctx_t* ctx, pony_actor_t* self) | |||
|
|||
do | |||
{ | |||
while((msg = ponyint_messageq_pop(&self->q)) != NULL) | |||
while((msg = ponyint_actor_messageq_pop( | |||
ctx->scheduler, self, &self->q)) != NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two spaces here.
src/libponyrt/sched/scheduler.h
Outdated
@@ -41,6 +46,7 @@ struct scheduler_t | |||
{ | |||
// These are rarely changed. | |||
pony_thread_id_t tid; | |||
int index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be best as an uint32_t
, the same type used in the scheduler initialisation loop. Alteratively, this could use the thread ID instead if it makes sense in DTrace. I'm not an expert here so I don't know if would really appropriate.
src/libponyrt/actor/messageq.c
Outdated
if(DTRACE_ENABLED(ACTOR_MSG_PUSH)) | ||
{ | ||
pony_msg_t* m = first; | ||
int index = (sched == NULL) ? -8 : sched->index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better with a define instead of a magic constant.
Thanks @Praetonus for the review. WRT all the whitespace changes, thanks. It hasn't been clear to me how the whitespace rule applied to function arguments. I guessed with two spaces to the right of the function's name. (Except when my editor's default 4 spaces slipped in without me noticing.) I'll fix them up. WRT performance measurements, I'll start by pestering @SeanTAllen for recommendations. WRT magic constants that you found, yup, I'll create #define's for them. |
ea96786
to
516189e
Compare
…ime. A microbenchmark for measuring message passing rates in the Pony runtime. This microbenchmark executes a sequence of intervals. During an interval, 1 second long by default, the SyncLeader actor sends an initial set of ping messages to a static set of Pinger actors. When a Pinger actor receives a ping() message, the Pinger will randomly choose another Pinger to forward the ping() message. This technique limits the total number of messages "in flight" in the runtime to avoid causing unnecessary memory consumption & overhead by the Pony runtime. This small program has several intended uses: * Demonstrate use of three types of actors in a Pony program: a timer, a SyncLeader, and many Pinger actors. * As a stress test for Pony runtime development, for example, finding deadlocks caused by experiments in the "Generalized runtime backpressure" work in pull request #2264 * As a stress test for measuring message send & receive overhead for experiments in the "Add DTrace probes for all message push and pop operations" work in pull request #2295
@slfritchie this needs to be rebased after I merged the runtime backpressure changes. you might have a few hairy bits not sure. Could you join us for sync next week to discuss this PR in more depth? |
@SeanTAllen Yes, I probably can join the sync on Wednesday. I've gone through the exercise of merging this branch with your backpressure branch a couple of times before, but I haven't tried lately. |
516189e
to
7211eb4
Compare
I thought that I'd commented earlier that @JONBRWN had run the Wallaroo "market spread" app and seen no change in throughput or the latencies at each processing stage within the app. (I'm not sure if that comment is no longer shown by GitHub or if I'm imagining having ever written it.) Good news. Note, however, that his test was run before the final squash & rebase. Also, I've run tests on my MacBook Pro (4 core + HyperThreading) on macOS Sierra 10.12.6. The results are at https://gist.github.com/slfritchie/f9c38c2b938174c88c3c63659a3128eb. The tests all use
|
src/libponyrt/actor/messageq.c
Outdated
} | ||
|
||
pony_msg_t* ponyint_actor_messageq_pop(scheduler_t* sched, | ||
pony_actor_t* actor, messageq_t* q) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
src/libponyrt/actor/messageq.c
Outdated
bool ponyint_messageq_push_single(messageq_t* q, pony_msg_t* first, | ||
pony_msg_t* last) | ||
static bool messageq_push_single(messageq_t* q, | ||
pony_msg_t* first, pony_msg_t* last) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
7211eb4
to
865280f
Compare
Here's a final sanity check. I used the same microbenchmark mentioned in #2295 (comment) to check if the latest rebase on
I'm really puzzled by the Mac's performance; the 1.5% improvement was quite repeatable. I'm also puzzled by the AMD's results. I tried that experiment first and was disappointed, but it's also repeatable. Does anyone get fired for buying^Wmicrobenchmarking only IBM^WEC2? |
865280f
to
ec385b8
Compare
Existing DTrace probes for message sending & receiving / push&pop are placed in an ad hoc fashion. As a result, many actor messages are not visible to DTrace. This patch places new DTrace probes within the lowest-level code for adding & removing messages from queues. The same queue functions are used for communication between threads. New probes are also added to observe inter-thread communication. I've introduced C preprocessor macros to wrap the message push & pop operations. The purposes of the macros are: * Encode the queue category into the function call: actor or thread. * Avoid cluttering the runtime's code by hiding the casting of various args to uintptr_t. Also added are two new scripts in the `examples/dtrace` directory: * mbox-size-all-actor-messages.d: count probe firings for actor message pushes & pops. * mbox-size-all-thread-messages.d: count probe firings for thread message pushes & pops. Aside from adding new constants (usually via CPP macros), this patch adds a field to `struct scheduler_t` to permit reporting of which scheduler thread fires a DTrace probe. Also: Try @dipinhora's suggestion to avoid unnecessary branches Via IRC in the #ponylang channel, @dipinhora suggested that I try to avoid unnecessary branches (and their likely performance penalty) and get rid of the macros that I'd introduced with this PR. I've removed the macros and instead created thread- and actor- version of the messageq_push, messageq_push_single, and messageq_pop functions. The push & push single versions have a `static bool` common function that allows them to share code. I've had to make some changes to a couple header files to avoid chicken-and-the-egg problems when using data types not yet defined Measurements by Jonathan Brown using the Wallaroo "Market Spread" app shows no change in throughput rates or latencies at each stage of the app.
Switch ponyint_actor_messageq_* and ponyint_thread_messageq_* function argument order to match the argument order of messageq.c's static functions messageq_push() and messageq_push_single(). I also changed the pop function argument order for symmetry: the "optional" tracing-related arguments are at the end of the argument list.
ec385b8
to
267958b
Compare
405eba2
to
1bd82f4
Compare
We discussed on the sync call, and I'm good with the way this was done with the However, I want @Praetonus to sign off on this before merging, so he can confirm that his concerns are resolved. |
Thanks, everyone, for your helpful reviews. |
Kickass Scott. This is AWESOME! |
@slfritchie I think this deserves a CHANGELOG entry, can you add an entry to the CHANGELOG for this? |
@SeanTAllen Oops, sorry, I didn't realize that the CHANGELOG bot thingie was in use by this repo. I'll make a tiny PR. |
Existing DTrace probes for message sending & receiving / push&pop are
placed in an ad hoc fashion. As a result, many actor messages are
not visible to DTrace. This patch places new DTrace probes within the
lowest-level code for adding & removing messages from queues.
The same queue functions are used for communication between threads.
New probes are also added to observe inter-thread communication.
I've introduced C preprocessor macros to wrap the message push & pop
operations. The purposes of the macros are:
various args to uintptr_t.
Also added are two new scripts in the
examples/dtrace
directory:pushes & pops.
pushes & pops.
Aside from adding new constants (usually via CPP macros), this patch
adds a field to
struct scheduler_t
to permit reporting of whichscheduler thread fires a DTrace probe.
Stuff I expect to get questions and/or AYFKM from reviewers
^_^
includes:ACTOR_MESSAGEQ_PUSH()
andACTOR_MESSAGEQ_POP()
to hide some argument casts & argument additions. (Perhaps could use real functions and hope that inlining would do The Right Thing?)Q_TYPE_THREAD
and thesched_msg_t
enum.I've been using slow-actor.pony for testing on OS X Sierra/10.12.6. Example output of the two new DTrace scripts can be found at https://gist.github.com/slfritchie/69ad5f5d00e5d02b05af9b3f117824fe and at https://gist.github.com/slfritchie/d4d30159864396d302f6f05d7fec0c89