further refactoring of the code
building with 5.0-alpha1
removing 3.11 from the build
updating the project to build with Java 11
further simplification of the modules
generalize tests to test Cassandra 4 and 5
smiklosovic committed Sep 13, 2023
1 parent 6530706 commit 660b0b0
Showing 31 changed files with 634 additions and 793 deletions.
6 changes: 3 additions & 3 deletions .circleci/config.yml
@@ -5,7 +5,7 @@ jobs:
working_directory: ~/esop

docker:
- image: cimg/openjdk:8.0
- image: cimg/openjdk:11.0.20

steps:

@@ -16,7 +16,7 @@ jobs:
- m2-{{ checksum "pom.xml" }}
- m2-

- run: (echo "${google_application_credentials}" > /tmp/gcp.json) && mvn clean install -PsnapshotRepo,rpm,deb,localTests -DoutputDirectory=/tmp/artifacts -Dcassandra3.version=3.11.14 -Dcassandra4.version=4.1.0
- run: (echo "${google_application_credentials}" > /tmp/gcp.json) && mvn clean install -PsnapshotRepo,rpm,deb,localTests -DoutputDirectory=/tmp/artifacts -Dcassandra4.version=4.1.2 -Dcassandra5.version=5.0-alpha1

- save_cache:
paths:
@@ -38,7 +38,7 @@ jobs:

publish-github-release:
docker:
- image: cimg/go:1.17
- image: cimg/go:1.21.1
steps:
- attach_workspace:
at: ./artifacts
145 changes: 15 additions & 130 deletions README.adoc
@@ -57,7 +57,7 @@ restore and backup remotely, use Icarus which embeds this project.
## Supported Cassandra Versions

Since we are talking to Cassandra via JMX, almost any Cassandra version is supported.
We are testing this tool with Cassandra 3.11.8, 4.0-beta3, and 2.2.18.
We are testing this tool with Cassandra 5.x and 4.x.

## Usage

@@ -128,7 +128,7 @@ _from scratch_ or if you use <<In-place restoration strategy>>.
Data to back up and restore from are located in remote storage. This setting is controlled by the flag
`--storage-location`. The storage location flag has a very specific structure which also indicates where data will be
uploaded. Locations consist of a storage _protocol_ and a path. Please keep in mind that the protocol we are using is not a
_real_ protocol. It is merely a mnemonic. Use either `s3`, `gcp`, `azure`, `oracle`, `minio`, `ceph` or `file`.
_real_ protocol. It is merely a mnemonic. Use either `s3`, `gcp`, `azure` or `file`.

The format is:

@@ -178,37 +178,7 @@ The most notable fact is that if no credentials are set explicitly, it will try
properties of the node it runs on. If that node runs in AWS EC2, it will resolve them with the help of that particular instance.
S3 connectors will expect to find environment properties `AWS_ACCESS_KEY_ID` and `AWS_SECRET_KEY`.
They will also accept `AWS_REGION` and `AWS_ENDPOINT` environment properties—however they are not required.
If `AWS_ENDPOINT` is set, `AWS_REGION` has to be set too.
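For illustration only, a minimal sketch of providing these credentials through the environment before a backup could look like this; the key values, region and bucket name are placeholders, not defaults:

----
export AWS_ACCESS_KEY_ID=AKIA...          # placeholder access key id
export AWS_SECRET_KEY=...                 # placeholder secret key
export AWS_REGION=eu-central-1            # optional unless AWS_ENDPOINT is set

java -jar esop.jar backup \
  --storage-location=s3://instaclustr-oss-esop-bucket \
  --data-dir /my/installation/of/cassandra/data/data \
  --snapshot-tag=snapshot-1
----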
S3 currently supports two different addressing models: path-style and virtual-hosted style (https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAPI.html).
Esop supports different S3 providers and applies the following default addressing models.
.default settings per provider
|===
|provider |addressing model
|AWS
|virtual
|Ceph
|virtual
|Minio
|path
|Oracle
|path
|===
Providing the `AWS_ENABLE_PATH_STYLE_ACCESS` environment variable with `true` or `false` overrides this default setting. Note that this applies to each provider, except when running in Kubernetes.
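For example, a hypothetical one-liner forcing path-style addressing before running a backup would be:

----
export AWS_ENABLE_PATH_STYLE_ACCESS=true
----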
The communication with S3 might be insecure; this is controlled by the `--insecure-http` flag on the command line. By default,
it uses HTTPS.
They will also accept `AWS_REGION`.
It is possible to connect to S3 via a proxy; please consult the `--use-proxy` flag and the `--proxy-*` family of settings on the command line.
@@ -220,74 +190,6 @@ Azure module expects `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_KEY` environment
GCP module expects `GOOGLE_APPLICATION_CREDENTIALS` environment property or `google.application.credentials` to be set with the path to service account credentials.
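As a quick sketch, the corresponding environment for the Azure and GCP modules could be prepared as follows; the values are placeholders:

----
export AZURE_STORAGE_ACCOUNT=my-storage-account
export AZURE_STORAGE_KEY=...
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
----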
#### Oracle
The Oracle module behaves the same way as S3 when it comes to credentials.
#### Ceph
The Ceph module uses the https://docs.ceph.com/en/latest/radosgw/s3/java/[Amazon S3 driver] for the
https://docs.ceph.com/en/latest/radosgw/[Ceph Object Gateway]. Credentials-wise,
it behaves the same way as "normal" S3. **You are required to set an endpoint for the AmazonS3 client.**
In that case, be sure the `AWS_ENDPOINT` environment property is set or the `awsendpoint` property in the Kubernetes
secret is specified. You need to provide the typical access key and secret key too.
Please consult the following section to learn more about Kubernetes-related
authentication property resolution. Setting the protocol to HTTP can be achieved similarly to the normal
S3 module, by specifying the `--insecure-http` flag.
#### Minio
`minio` is an alias of `oracle`. Both Oracle and Minio use path-style requests, which the S3 module does not.
#### Authentication in Kubernetes
If this tooling is run in the context of Kubernetes, we need to inject these credentials dynamically upon every request.
If these credentials are not set statically, e.g. as environment or system properties, we may have an
application like Cassandra Sidecar which resolves these credentials on every backup or restore request, so
they may be changed over time by Kubernetes operators (the people). By injecting them dynamically, we are separating the lifecycle
of a credential from the lifecycle of a backup/restore/Sidecar application.
Credentials are stored as a Secret. The namespace to read that Secret from is specified by the flag `--k8s-namespace` and
the Secret to read credentials from is specified by the flag `--k8s-secret-name`. If the namespace flag is not used,
it defaults to `default`. If the secret name is not used, it is resolved as `cassandra-backup-restore-secret-cluster-\{clusterId\}` where
`clusterId` is taken from the cluster name in `--storage-location`.
The secret has to contain these fields:
```
apiVersion: v1
kind: Secret
metadata:
  name: cassandra-backup-restore-secret-cluster-my-cluster
type: Opaque
stringData:
  awssecretaccesskey: _AWS secret key_
  awsaccesskeyid: _AWS access id_
  awsregion: e.g. eu-central-1
  awsendpoint: endpoint
  azurestorageaccount: _Azure storage account_
  azurestoragekey: _Azure storage key_
  gcp: 'whole json with service account'
```
Of course, if we do not plan to use other storage providers, feel free to omit the properties for them.

For S3, only the secret key and access key are required.
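As an illustration, assuming the manifest above is saved as `esop-secret.yaml` (the file name is only a placeholder), the Secret could be created with plain kubectl and then referenced by the flags mentioned above:

----
kubectl apply --namespace default -f esop-secret.yaml

# a backup or restore request then points at it via:
#   --k8s-namespace=default
#   --k8s-secret-name=cassandra-backup-restore-secret-cluster-my-cluster
----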

The fact that the code is running in the context of Kubernetes is derived from two facts:

* there are environment properties `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` in the respective
container this tool is invoked in
* this tool runs outside of Kubernetes but as _a client_, meaning it will resolve credentials from there even though it
does not run in any container. This is helpful, for example, during tests where we do not run it inside Kubernetes
but we want to be sure that the logic dealing with credentials resolution works properly. This is controlled by
the system property `kubernetes.client`, which is `false` by default.
There might be a third (rather special) case: we want to run this tool in Kubernetes (so the environment properties would be there) but
we want to run it as a client. Normally, the first condition would be fulfilled. There is a property called `pretend.not.running.in.kubernetes`,
which defaults to `false`. If set to `true`, even when we run our tool in Kubernetes, it will act as a client, so it will not
retrieve credentials from a Kubernetes Secret but from system and environment variables.
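To make the client mode concrete, a sketch of invoking the tool outside of Kubernetes while still resolving credentials from a Secret might look like this; the bucket and secret name reuse the placeholders from the examples above:

----
java -Dkubernetes.client=true -jar esop.jar backup \
  --storage-location=s3://instaclustr-oss-esop-bucket \
  --data-dir /my/installation/of/cassandra/data/data \
  --snapshot-tag=snapshot-1 \
  --k8s-namespace=default \
  --k8s-secret-name=cassandra-backup-restore-secret-cluster-my-cluster
----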

### Directory Structure of a Remote Destination
Cassandra data files as well as some meta-data needed for successful restoration are uploaded into a bucket
@@ -1048,13 +950,15 @@ consisting of a set of SSTables, all SSTables which were previously a part of th
a part of the current backup would not be touched - hence no modification date would be refreshed - so they would expire.
For cases where versioning is enabled (currently known to be an issue for S3 backups only),
our attempt to refresh it would create new, versioned, file. This is not desired. Hence we
have the possibility to skip refreshment and we just detect if a file is there or not, but you would
our attempt to refresh it would create a new, versioned file. This is not desired. Hence, we
have the possibility to skip refreshing, and we just detect whether a file is there or not, but you would
lose the ability to expire objects as described above.
This behavior is controlled by a flag called `--skip-refreshing` on the backup command. By default, when
not specified, it evaluates to `false`, so skipping does not happen.
Currently, this functionality does not work for the s3 protocol.
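As a sketch, a backup invocation with refreshing skipped could look as follows; the bucket is a placeholder and a non-s3 protocol is used on purpose because of the limitation just mentioned:

----
java -jar esop.jar backup \
  --storage-location=gcp://my-esop-bucket \
  --data-dir /my/installation/of/cassandra/data/data \
  --entities=ks1 \
  --snapshot-tag=snapshot-1 \
  --skip-refreshing
----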
### Retry of upload / download operations
Imagine there is a restore happening which is downloading 100 GB of data and your connectivity
@@ -1275,37 +1179,32 @@ for each dc separately)
In order to perform the encryption of your SSTables, so they are stored in a remote AWS S3 bucket already encrypted,
we leverage AWS KMS client-side encryption by https://github.com/aws/amazon-s3-encryption-client-java[this library].
### s3v2 protocol

The *s3v2* protocol has to be used in order to use KMS client encryption.

Historically, Esop was using version 1 of the AWS API; however, the library which makes client-side encryption possible
uses version 2 of the API. The version 1 and version 2 APIs can live in one project simultaneously. As the AWS KMS encryption
feature in Esop is rather new, we decided to code one additional S3 module which uses the V2 API, and
we left the V1 API implementation untouched in case users still prefer it for whatever reason. We might eventually switch to
the V2 API completely and drop the code using the V1 API in the future.
To use client-side encryption, the protocol needs to be set to `s3v2` instead of `s3` (which uses V1 API).
A user also needs to supply a KMS key id to encrypt data with. The creation of a KMS key is out of the scope of this document;
however, keep in mind that such a key has to be symmetric.
An example of an encrypted backup is shown below:
----
java -jar esop.jar backup \
--storage-location=s3v2://instaclustr-oss-esop-bucket
--storage-location=s3://instaclustr-oss-esop-bucket
--data-dir /my/installation/of/cassandra/data/data \
--entities=ks1 \
--snapshot-tag=snapshot-1 \
--kmsKeyId=3bbebd10-7e5f-4fad-997a-89b51040df4c
----
Notice we use `s3v2` as protocol. We also sed `kmsKeyId` referencing name of KMS key in AWS to use for encryption.
Notice we also set `kmsKeyId`, referencing the name of the KMS key in AWS to use for encryption.
The KMS key ID is also read from the system property `AWS_KMS_KEY_ID` or an environment property of the same name.
The key ID from the command line takes precedence over the system property, which takes precedence over the environment property.
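Equivalently, a sketch relying on the environment property instead of the command-line flag, with the key id and bucket reused from the example above:

----
export AWS_KMS_KEY_ID=3bbebd10-7e5f-4fad-997a-89b51040df4c

java -jar esop.jar backup \
  --storage-location=s3://instaclustr-oss-esop-bucket \
  --data-dir /my/installation/of/cassandra/data/data \
  --entities=ks1 \
  --snapshot-tag=snapshot-1
----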
If `--storage-location` is not fully specified, Esop will try to connect to a runnning node via JMX, and it resolves
If `--storage-location` is not fully specified, Esop will try to connect to a running node via JMX, and it resolves
what cluster and datacenter it belongs to and what node ID it has.
The uploading logic of a particular SSTable file is as follows. First we need
@@ -1316,8 +1215,8 @@ to it is this:
** if such key is not found, we need to upload a file
* if we are using encrypting backup (by having `--kmsKeyId` set), we prepare a tag
which has `kmsKey` as a key and KMS key ID as a value
* if tags of a remote key are not set or if they are not containg `kmsKey` tag,
that means that the remote object exists but it is not encrypted. Hence, we
* if the tags of a remote key are not set or if they do not contain the `kmsKey` tag,
that means that the remote object exists, but it is not encrypted. Hence, we
will need to upload it again, but encrypted this time
* if we are not skipping the refresh, we will copy the file with `kmsKey` tag
@@ -1378,7 +1277,7 @@ bucket:
S6 encrypted, backup 2
----
We see that we are going to backup S3, S4 (compacted S1 and S2), S5 and S6. S3 is already uploaded,
We see that we are going to back up S3, S4 (compacted S1 and S2), S5 and S6. S3 is already uploaded,
but it is not encrypted, so S3 will be re-uploaded and encrypted. S4, S5 and S6 are not present remotely yet so all of them will be encrypted and uploaded.
After doing so, we see this in the bucket:
@@ -1407,7 +1306,7 @@ To answer the first question is rather easy. If you want to use a different KMS
situation as if we were going to upload but no key was used. If we detect that an already uploaded
object was encrypted with a KMS key (read from its tags) different from the key we want to use now,
we just need to re-upload such an SSTable and encrypt it with the new KMS key.
All other logic already exaplained is same.
All other logic already explained stays the same.
Restoration will read the tags of a remote object to see what KMS key it was encrypted with. If the remote
object was stored as plaintext, no wrapping S3 encryption client is used. If KMS key
@@ -1432,7 +1331,7 @@ restoration in this scenario possible.
## Logging
We are using logback. There is already `logback.xml` embedded in the built JAR. However if you
We are using logback. There is already `logback.xml` embedded in the built JAR. However, if you
want to configure it, feel free to provide your own `logback.xml` and configure it like this:
----
@@ -1453,30 +1352,16 @@ Here are the test groups/profiles:
* googleTest
* s3Tests
* cloudTest—runs tests which will be using cloud "buckets" for backup / restore
* k8sTest—same as `cloudTest` above, but credentials will be fetched from Kubernetes.
There is no need to create buckets in a cloud beforehand as they will be created and deleted
as part of a test automatically, per cloud provider.
If a test is "Kubernetes-aware", credentials are created as a Secret before every test
and used by the backup/restore tooling during that test. We are simulating here how
this tooling can be easily embedded into, for example, Cassandra Sidecar (part of the Cassandra operator).
We avoid the need to specify credentials upfront when a Kubernetes pod is starting, as part
of its spec, by dynamically fetching all credentials from a Secret whose name is passed in a
backup request and is read every time. The side effect of this is that we can change our credentials
without restarting a pod to re-read them, because they will be read dynamically upon every backup request.

Cloud tests are executed like this:
----
$ mvn clean install -PcloudTests
----
Kubernetes tests are executed like this:
----
$ mvn clean install -Pk8sTests
----

By default, `mvn install` is invoked with `noCloudTests`, which will skip all tests dealing with any
storage provider but `file://`.
26 changes: 3 additions & 23 deletions pom.xml
@@ -47,10 +47,9 @@
<git.build.time/>

<outputDirectory>${project.build.directory}</outputDirectory>

<!-- Cassandra 3 does not work with Java 11 -->
<java.source.version>8</java.source.version>
<java.target.version>8</java.target.version>

<java.source.version>11</java.source.version>
<java.target.version>11</java.target.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

@@ -593,25 +592,6 @@
</plugins>
</build>
</profile>

<profile>
<id>cephTests</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>${maven.surefire.plugin.version}</version>
<configuration>
<groups>cephTest</groups>
</configuration>
</plugin>
</plugins>
</build>
</profile>

<profile>
<id>snapshotRepo</id>
4 changes: 2 additions & 2 deletions src/main/java/com/instaclustr/esop/guice/StorageModules.java
@@ -4,7 +4,7 @@
import com.instaclustr.esop.azure.AzureModule;
import com.instaclustr.esop.gcp.GCPModule;
import com.instaclustr.esop.local.LocalFileModule;
import com.instaclustr.esop.s3.aws_v2.S3V2Module;
import com.instaclustr.esop.s3.aws_v2.S3Module;

public class StorageModules extends AbstractModule
{
@@ -14,6 +14,6 @@ protected void configure()
install(new AzureModule());
install(new GCPModule());
install(new LocalFileModule());
install(new S3V2Module());
install(new S3Module());
}
}
10 changes: 5 additions & 5 deletions src/main/java/com/instaclustr/esop/impl/AbstractTracker.java
@@ -154,9 +154,9 @@ public synchronized Session<UNIT> submit(final INTERACTOR interactor,
sessions.stream().filter(s -> s.getUnits().contains(value)).forEach(s -> {
operationsService.operation(s.getId()).ifPresent(op -> {
s.finishedUnits.incrementAndGet();
logger.info(String.format("Progress for snapshot %s: %s",
logger.info(String.format("Progress for snapshot %s: %.2f",
s.snapshotTag,
s.getProgress()));
s.getProgress() * 100));
op.progress = s.getProgress();
});
});
@@ -207,12 +207,12 @@ public void cancelIfNecessary(final Session<? extends Unit> session) {
// the most probably because it waits until it fits into pool
session.getNonFailedUnits().forEach(unit -> {
if (unit.getState() == NOT_STARTED) {
logger.info(format("Ignoring %s from processing because there was an errorneous unit in a session %s",
logger.info(format("Ignoring %s from processing because there was an erroneous unit in a session %s",
unit.getManifestEntry().localFile,
session.id));
unit.setState(IGNORED);
} else if (unit.getState() == Unit.State.RUNNING) {
logger.info(format("Cancelling %s because there was an errorneous unit in a session %s",
logger.info(format("Cancelling %s because there was an erroneous unit in a session %s",
unit.getManifestEntry().localFile,
session.id));
unit.setState(CANCELLED);
@@ -381,7 +381,7 @@ public void waitUntilConsideredFinished() {
logger.info(format("%sSession %s has finished %s",
snapshotTag != null ? "Snapshot " + snapshotTag + " - " : "",
id,
isSuccessful() ? "successfully" : "errorneously"));
isSuccessful() ? "successfully" : "erroneously"));
}

public void addUnit(final U unit) {
@@ -96,7 +96,7 @@ protected void run0() throws Exception {
assert cassandraJMXService != null;
assert cassandraVersion != null;

if (!CassandraVersion.isFour(cassandraVersion)) {
if (!CassandraVersion.isNewerOrEqualTo4(cassandraVersion)) {
throw new OperationFailureException(format("Underlying version of Cassandra is not supported to import SSTables: %s. Use this method "
+ "only if you run Cassandra 4 and above", cassandraVersion));
}
@@ -61,7 +61,7 @@ public MetadataDirective convert(final String value) {
@Option(names = {"--skip-refreshing"},
description = "Skip refreshing files on their last modification date in remote storage upon backup. When turned on, "
+ "there will be no attempt to change the last modification time, there will be just a check done on their presence "
+ "based on which a respective local file will be upload or not, defaults to false.")
+ "based on which a respective local file will be uploaded or not, defaults to false, does not work with s3.")
public boolean skipRefreshing;

public BaseBackupOperationRequest() {