Test an NP-series VM on Azure with Dragen's pay-as-you-go (PAYG) license
- Sign up for an Azure subscription if you don't already have one.
- Visit Quotas in the Azure Portal, log in if needed, and increase the NP series quota to 40 vCPUs, so we can run up to 4 NP10 VMs at a time, or 2 NP20 VMs. Depending on demand for these SKUs in your region, you may also need to submit a service request and justify your use case to a person before the quota is approved.
- Visit this page, log in if needed, and ensure that Status is set to Enable for the Azure subscription you intend to use. This allows programmatic deployment of the PAYG VMs, as we will test below.
- Install Azure CLI and use the az login command to log in with the same Microsoft credentials you used above.
- Make sure you have an RSA or ED25519 private/public key pair in your .ssh folder.
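The quota figure in the list above follows from the size of each VM: Azure counts VM-family quota in vCPUs, and NP10s and NP20s have 10 and 20 vCPUs respectively. A minimal sketch of that arithmetic, plus a helper for viewing current usage. The function names are hypothetical, eastus is assumed because it is the region used in the commands that follow, and the grep pattern assumes the NP family's usage row contains "NP" in its name:

```shell
# vCPUs per NP-series size, per Azure's published NP-series specs.
np_vcpus() {
  case "$1" in
    Standard_NP10s) echo 10 ;;
    Standard_NP20s) echo 20 ;;
    Standard_NP40s) echo 40 ;;
    *) echo "unknown NP size: $1" >&2; return 1 ;;
  esac
}

# Quota (in vCPUs) needed to run COUNT VMs of SIZE at the same time.
# Usage: quota_needed SIZE COUNT
quota_needed() {
  local per_vm
  per_vm=$(np_vcpus "$1") || return 1
  echo $(( per_vm * $2 ))
}

# Show current NP-family usage against its limit (requires `az login`).
# Assumption: the usage row for this family contains "NP" in its name.
show_np_usage() {
  az vm list-usage --location "${1:-eastus}" --output table | grep "NP"
}
```

Both quota_needed Standard_NP10s 4 and quota_needed Standard_NP20s 2 print 40, which is why a quota of 40 covers either setup.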
Create a resource group with a basic virtual network that allows SSH:
az group create --name dgn-rg --location eastus
az network nsg create --resource-group dgn-rg --name dgn-nsg
az network nsg rule create --resource-group dgn-rg --nsg-name dgn-nsg --name SSH --priority 300 --protocol TCP --access Allow --direction Inbound --source-address-prefixes "*" --source-port-ranges "*" --destination-address-prefixes "*" --destination-port-ranges 22
az network vnet create --resource-group dgn-rg --network-security-group dgn-nsg --name dgn-vnet --address-prefixes "10.0.0.0/16" --subnet-name default --subnet-prefixes "10.0.0.0/24"
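The rule above accepts SSH from any source address ("*"). If that is broader than you want, one option is to narrow the rule to a single client address afterwards. A minimal sketch, assuming you already know your public IPv4 address. The names to_cidr and restrict_ssh are hypothetical helpers, and 203.0.113.7 is a documentation-only placeholder address:

```shell
# Turn a dotted-quad IPv4 address into a single-host CIDR block.
to_cidr() {
  local ip="$1" a b c d extra octet
  # Reject empty input, non-digit/dot characters, and leading/trailing dots.
  case "$ip" in
    ''|*[!0-9.]*|.*|*.) echo "not an IPv4 address: $ip" >&2; return 1 ;;
  esac
  # Split on dots; anything beyond four fields lands in $extra.
  IFS=. read -r a b c d extra <<EOF
$ip
EOF
  [ -n "$d" ] && [ -z "$extra" ] || { echo "not an IPv4 address: $ip" >&2; return 1; }
  for octet in "$a" "$b" "$c" "$d"; do
    [ -n "$octet" ] && [ "$octet" -le 255 ] || { echo "bad octet in: $ip" >&2; return 1; }
  done
  echo "$ip/32"
}

# Limit the SSH rule created above to one source address (requires `az login`).
restrict_ssh() {
  local cidr
  cidr=$(to_cidr "$1") || return 1
  az network nsg rule update --resource-group dgn-rg --nsg-name dgn-nsg \
    --name SSH --source-address-prefixes "$cidr"
}
```

For example, restrict_ssh 203.0.113.7 would update the rule to allow only 203.0.113.7/32.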
Create a VM (includes creation of NIC and public IP):
az vm create --resource-group dgn-rg --nsg dgn-nsg --vnet-name dgn-vnet --subnet default --name dgn-vm --size Standard_NP10s --ephemeral-os-disk true --ephemeral-placement CacheDisk --security-type Standard --public-ip-sku Standard --image illuminainc1586452220102:dragen-vm-payg:dragen-4-3-6-payg:latest --plan-publisher illuminainc1586452220102 --plan-product dragen-vm-payg --plan-name dragen-4-3-6-payg --accept-term --admin-username ckandoth --ssh-key-values ~/.ssh/id_ed25519.pub
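The --image value in the command above is a URN of the form publisher:offer:sku:version, and for this Marketplace image the three --plan-* flags repeat its first three fields. A small sketch that builds the URN from those parts so each value is written only once. The names image_urn and show_terms are hypothetical helpers:

```shell
# Build an image URN (publisher:offer:sku:version); version defaults to "latest".
image_urn() {
  echo "$1:$2:$3:${4:-latest}"
}

# Show the Marketplace terms for an image URN (requires `az login`).
show_terms() {
  az vm image terms show --urn "$1"
}

PUBLISHER=illuminainc1586452220102
OFFER=dragen-vm-payg
SKU=dragen-4-3-6-payg
URN=$(image_urn "$PUBLISHER" "$OFFER" "$SKU")
echo "$URN"
```

The variables can then be passed as --image "$URN" --plan-publisher "$PUBLISHER" --plan-product "$OFFER" --plan-name "$SKU". Running show_terms "$URN" should report whether the image terms have been accepted for the current subscription.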
The public IP address is printed when the VM is created, or we can look it up as follows:
az network public-ip list
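Without further arguments, az network public-ip list prints the full JSON for every public IP in the subscription. A minimal sketch that asks for just this VM's address instead, using the resource names from the commands above. The name vm_public_ip is a hypothetical helper:

```shell
# Print only the public IP address of a VM (requires `az login`).
# Usage: vm_public_ip [RESOURCE_GROUP] [VM_NAME]
vm_public_ip() {
  az vm show --show-details --resource-group "${1:-dgn-rg}" \
    --name "${2:-dgn-vm}" --query publicIps --output tsv
}
```

This makes it possible to connect without copying the address by hand, for example with ssh "ckandoth@$(vm_public_ip)".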
SSH into the VM (substituting your VM's public IP) and take ownership of the large 778GB disk mounted at /mnt:
ssh ckandoth@172.190.49.30
sudo chown "$(id -u):$(id -g)" /mnt
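Before downloading data, it can be worth confirming that /mnt really is the large disk and has the expected free space. A minimal sketch using df. The name free_gb is a hypothetical helper:

```shell
# Print the free space, in whole GiB, of the filesystem that holds PATH.
# df -P guarantees one line per filesystem; column 4 is available KiB.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}
```

Running free_gb /mnt on the VM should print a number in the hundreds if the large disk is mounted there.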
Download a test dataset onto the large disk at /mnt and build a hash table for chr21:
mkdir /mnt/tmp
wget -P /mnt/tmp https://data.cyri.ac/test_tum_nrm_wgs.tar
tar -xf /mnt/tmp/test_tum_nrm_wgs.tar -C /mnt
dragen --intermediate-results-dir /mnt/tmp --build-hash-table true --ht-reference /mnt/ref/GRCh38_chr21.fa --output-directory /mnt/ref --ht-num-threads=8
Run alignment of the test FASTQs against the chr21 hash table:
mkdir /mnt/out
dragen --intermediate-results-dir /mnt/tmp -r /mnt/ref -1 /mnt/nrm/nrm_C0JD1ACXX_L001_R1_001.fastq.gz -2 /mnt/nrm/nrm_C0JD1ACXX_L001_R2_001.fastq.gz --RGID C0JD1ACXX.1 --RGSM nrm --output-directory /mnt/out --output-file-prefix nrm --enable-duplicate-marking true --enable-map-align-output true --output-format cram
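Once the run finishes, a quick check that the alignment output exists can catch a failed run early. A minimal sketch, assuming DRAGEN names the alignment file after --output-file-prefix with a .cram extension (so /mnt/out/nrm.cram here). The name check_cram is a hypothetical helper:

```shell
# Succeed if DIR/PREFIX.cram exists and is non-empty.
# Usage: check_cram DIR PREFIX
check_cram() {
  local cram="$1/$2.cram"
  if [ -s "$cram" ]; then
    echo "found $cram"
  else
    echo "missing or empty: $cram" >&2
    return 1
  fi
}
```

For the run above this would be called as check_cram /mnt/out nrm.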
To avoid the cost of transferring data in and out of Azure, use Networked Streaming with Azure Blob Storage.
Delete the VM when you are done, since it is the most expensive resource:
az vm delete --yes --resource-group dgn-rg --name dgn-vm
Delete the whole resource group to stop spending money altogether:
az group delete --yes --name dgn-rg
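To confirm that nothing is left running, az group exists prints true or false for a given resource group. A minimal sketch built on that. The name group_gone is a hypothetical helper:

```shell
# Succeed if the resource group no longer exists (requires `az login`).
# Usage: group_gone [RESOURCE_GROUP]
group_gone() {
  [ "$(az group exists --name "${1:-dgn-rg}")" = "false" ]
}
```

For example, group_gone && echo "dgn-rg is deleted" prints the message only after the deletion has completed.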