
RFC: benchmarks for m5d.2xlarge #388

Closed
wants to merge 1 commit

Conversation

Rizziepit
Collaborator

These are the parachain weights for an m5d.2xlarge instance with a general purpose SSD (gp2 in AWS console). It's very similar to the c5ad.2xlarge instance (also gp2), except for the most expensive extrinsic, import_header, which has a 24% higher weight due to CPU alone. Unfortunately the c5 instances have half the memory of m5.

How much memory do we need to run a parachain node? If 16GB is fine, then I'd recommend c5ad.2xlarge with provisioned IOPS maxed out to lower the DB read/write weights.
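For reference, weights like these are typically generated with the Substrate benchmarking CLI, roughly along these lines (the binary name, pallet, and output path below are illustrative placeholders, not necessarily the exact command used here):

# example Substrate benchmark invocation (names/paths are placeholders)
./target/release/snowbridge benchmark \
    --chain spec.json \
    --execution wasm --wasm-execution compiled \
    --pallet verifier_lightclient \
    --extrinsic import_header \
    --steps 50 --repeat 20 \
    --output runtime/src/weights/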

@vgeddes
Collaborator

vgeddes commented May 6, 2021

I'm a bit confused when you mention c5ad.2xlarge with provisioned IOPS. That instance type has a local SSD on the host machine, so provisioned IOPS (an EBS feature) should be irrelevant there. But we need to ensure that the local SSD is mounted, formatted, and that the benchmarks are actually using it.

Can you check the output of lsblk on your running c5ad/m5ad instance? It will tell us more details about the attached block devices.

The import_header dispatchable being much faster on c5ad is definitely compelling. Though if it turns out we're not using the local SSD, then we'll need to run those benchmarks again.
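If it turns out the local SSD isn't mounted, getting it into use should be roughly this (a sketch, assuming the instance store shows up as /dev/nvme1n1 as in a typical lsblk listing; adjust to whatever device is actually reported):

# format and mount the instance-store NVMe (wipes anything on it)
sudo mkfs.ext4 -E nodiscard /dev/nvme1n1
sudo mkdir -p /mnt/nvme
sudo mount /dev/nvme1n1 /mnt/nvme
# then point the node's database at it, e.g. --base-path /mnt/nvme/parachain-data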

@vgeddes
Collaborator

vgeddes commented May 6, 2021

Though if we're not currently using the local SSD and switch over to it, the performance boost will likely be large enough to make the import_header difference between c5ad and m5d inconsequential.

Which means we can hopefully just stick with m5d, without having to rerun the benchmarks on c5ad again.

@Rizziepit
Collaborator Author

Here's the lsblk output from c5ad (stock standard; I didn't mount any devices myself):

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0  33.3M  1 loop /snap/amazon-ssm-agent/3552
loop1         7:1    0  55.5M  1 loop /snap/core18/1988
loop2         7:2    0  55.5M  1 loop /snap/core18/1997
loop4         7:4    0  70.4M  1 loop /snap/lxd/19647
loop5         7:5    0  32.3M  1 loop /snap/snapd/11588
loop6         7:6    0  32.3M  1 loop /snap/snapd/11402
loop7         7:7    0  67.6M  1 loop /snap/lxd/20326
nvme1n1     259:0    0 279.4G  0 disk 
nvme0n1     259:1    0   100G  0 disk 
└─nvme0n1p1 259:2    0   100G  0 part /

And from m5d:

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0  33.3M  1 loop /snap/amazon-ssm-agent/3552
loop1         7:1    0  70.4M  1 loop /snap/lxd/19647
loop2         7:2    0  55.5M  1 loop /snap/core18/1997
loop3         7:3    0  32.3M  1 loop /snap/snapd/11588
loop4         7:4    0  67.6M  1 loop /snap/lxd/20326
nvme1n1     259:0    0 279.4G  0 disk 
nvme0n1     259:1    0   100G  0 disk 
└─nvme0n1p1 259:2    0   100G  0 part /

> I'm a bit confused when you mention c5ad.2xlarge with provisioned IOPS. That instance type has a local SSD on the host machine, so provisioned IOPS (an EBS feature) should be irrelevant there. But we need to ensure that the local SSD is mounted, formatted, and that the benchmarks are actually using it.

I think the confusion stems from the fact that we have two SSDs on these instances: an instance store SSD and an EBS SSD (that's apparently a thing now: https://aws.amazon.com/ebs/features/). You can see both in the lsblk output above. Only the EBS SSD is mounted. We can be fairly certain we're using an SSD because our DB weights are similar to Substrate's here.
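For completeness, this is an easy way to confirm which device the node's database actually sits on (the data directory here is just an example path):

# show the filesystem and block device backing the data directory
df -h /path/to/parachain-data
findmnt -T /path/to/parachain-data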

Two things:

  • If EBS uses an SSD, why bother with an instance store at all? It might be marginally faster, but we'd also have to take care of data persistence ourselves.
  • I increased provisioned IOPS and, while the fio benchmark improved significantly (rough invocation sketched below), it didn't improve the DB weights at all.
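The fio run was a random read/write test of roughly this shape (the parameters below are illustrative, not the precise invocation I used):

# 4k random read/write test against the mounted volume (directory is a placeholder)
fio --name=randrw --directory=/mnt/data --rw=randrw --bs=4k --size=4G \
    --ioengine=libaio --iodepth=32 --direct=1 --numjobs=1 \
    --time_based --runtime=60 --group_reporting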

I now think a c5a is actually best for us.

@vgeddes
Collaborator

vgeddes commented May 6, 2021

Ahh, ok, thanks for clarifying.

c5ad looks like the best choice then, as you say. 👍

EBS with provisioned IOPS is certainly an option, though it introduces other complexities, like determining how much provisioned IOPS we require and the cost thereof. According to some rough math I did a while back, maxed-out provisioned IOPS can get very expensive.
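To put a very rough number on it (using approximate io1 pricing of around $0.065 per provisioned IOPS-month, which varies by region and will have changed since): maxing out a single volume at 64,000 IOPS works out to roughly 64,000 × $0.065 ≈ $4,160 per month, before storage and data charges.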

@Rizziepit
Collaborator Author

> EBS with provisioned IOPS is certainly an option, though it introduces other complexities, like determining how much provisioned IOPS we require and the cost thereof. According to some rough math I did a while back, maxed-out provisioned IOPS can get very expensive.

The good/bad news is that provisioned IOPS don't improve our weights at all. I re-ran the benchmarks on an m5d with maxed-out IOPS last night, but didn't have the data yet when I created this PR. I was really surprised that it didn't perform any better, not even slightly. So there's no need to spend money on that.

@Rizziepit
Collaborator Author

I'm discarding this PR in favour of #395 since we've decided the c5a is a better match.

@Rizziepit Rizziepit closed this May 13, 2021
@vgeddes vgeddes deleted the benchmark-m5d.2xlarge branch June 1, 2022 13:38