Backups inside Atlan

A guide on how to create and store backups inside Atlan

How to create backups in Atlan for multiple databases

Elasticsearch

Backing up Atlan with Elasticsearch

To create an Elasticsearch backup, we use a snapshot repository registered via the Elasticsearch API. The repository definition contains the S3 bucket, region, and IAM role required for the backup process.

In brief, a cronjob hits an Elasticsearch API endpoint, which creates a snapshot of the current state of the ES cluster (i.e. all the indices). The snapshot is uploaded to the S3 bucket specified in the snapshot repository.
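
For illustration, the cronjob effectively runs a command like the one below. This is a sketch, not the exact job definition; the date-based tag is an assumption:

$ curl -s -X PUT "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository/atlan_nightly_backup_$(date +%Y%m%d)?wait_for_completion=true"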

🧙‍♀️ Remember: The backup process is incremental, i.e. each new snapshot builds on the previous one.

Follow the steps below to take the ES snapshot manually:

STEP 1: Get into any of the Elasticsearch pods

$ kubectl exec -it atlas-elasticsearch-master-2 -n atlas -- sh

STEP 2: Register the snapshot repository

This is only necessary if the snapshot repository isn't there already.

$ JSON_DATA='{ "type" : "s3", "settings" : { "bucket" : "s3-bucket-name", "base_path" : "es-backup/elasticsearch", "region" : "s3-bucket-region", "role_arn" : "arn value of the role with RW access to the specified bucket", "compress" : "true" }}'
$ curl -s -X PUT "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository?pretty" -H 'Content-Type: application/json' -d "$JSON_DATA"

You'll get output like this:

{
  "acknowledged" : true
}

You can now verify the repository using this command:

$ curl -s -X GET "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/_all?pretty"

STEP 3: Create and upload the snapshot

First, set a unique tag for the snapshot (no special characters or spaces):

$ tag="any-unique-value-without-special-characters-or-spaces"

Next, create the snapshot. Elasticsearch will upload it to the S3 bucket specified in the snapshot repository:

$ curl -X PUT "http://atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository/atlan_nightly_backup_$tag?wait_for_completion=true&pretty"
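
To confirm the snapshot succeeded, you can query it back from the repository; this assumes the same $tag set above:

$ curl -s -X GET "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository/atlan_nightly_backup_$tag?pretty"

The snapshot's "state" field should read SUCCESS.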

And your Elasticsearch backup is done! 🎉

If you want to know more about the Elasticsearch backup process, check out the official documentation.

Restoring a snapshot

Here are the steps to restore an existing snapshot to a running Elasticsearch cluster.

STEP 1: Get into any of the Elasticsearch pods

$ kubectl exec -it atlas-elasticsearch-master-2 -n atlas -- sh

STEP 2: Register the snapshot repository that points to the stored snapshot

The bucket specified here must be the one where the snapshot is stored.

$ JSON_DATA='{ "type" : "s3", "settings" : { "bucket" : "s3-bucket-name", "base_path" : "es-backup/elasticsearch", "region" : "s3-bucket-region", "role_arn" : "arn value of the role with RW access to the specified bucket", "compress" : "true" }}'
$ curl -s -X PUT "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository?pretty" -H 'Content-Type: application/json' -d "$JSON_DATA"

You'll get output like this:

{
  "acknowledged" : true
}

You can now verify the repository using this command:

$ curl -s -X GET "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/_all?pretty"

STEP 3: Delete all the indices that are already in Elasticsearch

This helps you avoid issues with restoration.

$ curl -s -X DELETE "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/*?pretty"

Next, verify that no indices are present:

$ curl -s -X GET "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_cat/indices"

STEP 4: Restore the Elasticsearch snapshot for a particular tag

This is the same value that was specified during the backup process.

$ tag="same-tag-used-during-the-backup-process"
$ curl -X POST "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository/atlan_nightly_backup_$tag/_restore?pretty"

You'll get output like this:

{
  "acknowledged" : true
}
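
By default this restores every index in the snapshot. If you only need a subset, the restore API also accepts a request body; the index pattern below is just an example:

$ curl -X POST "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_snapshot/atlan_s3_repository/atlan_nightly_backup_$tag/_restore?pretty" -H 'Content-Type: application/json' -d '{ "indices": "index-name-or-pattern-*" }'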

STEP 5: Wait for all the indices to turn green

You can check this using the command below:

$ curl -s -X GET "atlas-elasticsearch-master.atlas.svc.cluster.local:9200/_cat/indices"

Once all the indices are in the green state, the restore process is complete. 🎉

Cassandra

A utility called "Cain" will be used to back up and restore Cassandra. It provides backup and restore functionality for Cassandra in Kubernetes. You can find out more about the tool here.

Backing up Atlan with Cassandra

We have a Kubernetes cronjob running that uses the "nuvo/cain" image to launch a pod, create a snapshot of the Cassandra cluster and its data, and upload that snapshot to the specified S3 bucket under the specified path (e.g. s3://development-12451532/backup/cassandra/).

Follow the steps below to create a backup/snapshot of Cassandra manually using Cain.

STEP 1: Launch a pod and get shell access to it

Make sure to use nuvo/cain as the container image.

$ kubectl run -it --image nuvo/cain --serviceaccount atlas-cassandra --env AWS_REGION="s3-bucket-region" -n atlas cassandra-backup -- sh

STEP 2: Create a backup of the currently running Cassandra cluster

Once the pod is in a running state and you are inside it, use this command to create the backup:

$ cain backup -n atlas -l app=cassandra -k atlas --dst s3://s3-bucket-name/backup/cassandra -c atlas-cassandra

Now wait for the backup to complete. At the end you'll get a backup ID, which will be used during the restoration process. You can also find this backup ID in the S3 bucket under the path "s3://s3-bucket-name/backup/cassandra/atlas/cassandra/atlas/78ce5c/". There will be a lot of folders there; the folder names are the backup IDs.
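
For example, you can list the available backup IDs directly from the bucket with the AWS CLI, assuming it is configured with access to the bucket:

$ aws s3 ls s3://s3-bucket-name/backup/cassandra/atlas/cassandra/atlas/78ce5c/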

STEP 3: The backup process is now complete!

Exit the pod, then delete it:

$ kubectl delete pod cassandra-backup -n atlas

Restoring Cassandra

Follow the steps below to restore Cassandra.

STEP 1: Find the latest backup tag to be restored

To find the backup tag, you will need access to your AWS GUI account.

  1. In the AWS GUI, go to the S3 section.

  2. Search for your bucket with the Cassandra snapshot stored.

  3. Inside the bucket, go to the path "backup/cassandra/atlas/cassandra/atlas/78ce5c/".

  4. Find the last created folder. This will be the latest tag to restore.

STEP 2: Spin up a pod and exec into it

Make sure to use the image "nuvo/cain" and the service account "atlas-cassandra".

$ kubectl run --image nuvo/cain --serviceaccount atlas-cassandra --env AWS_REGION="s3-bucket-region" -n atlas cassandra-restoration -it -- sh

STEP 3: Perform the restoration

$ cain restore --src 's3://bucket-name/backup/cassandra/atlas/cassandra' -n atlas -l app=cassandra -t "latest backup tag" -k atlas -c atlas-cassandra --schema 78ce5c
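
For example, with a (hypothetical) backup tag of 20240101120000 taken from STEP 1, the command would look like this:

$ cain restore --src 's3://bucket-name/backup/cassandra/atlas/cassandra' -n atlas -l app=cassandra -t 20240101120000 -k atlas -c atlas-cassandra --schema 78ce5c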

The restoration will start and should finish in a few minutes. 🎉

STEP 4: Exit the pod and delete the restoration pod

$ kubectl delete pod cassandra-restoration -n atlas

Postgres

Backing up Atlan with Postgres

To create a backup of the Postgres database, we use a custom Docker image. It takes a dump of all the databases in Postgres, compresses the dump file to reduce its size, and uploads the compressed file to a specified S3 bucket.

👀 Note: This image normally runs via a Kubernetes cronjob, which runs at a specified time every day.
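
Under the hood, the image does something along these lines. This is a simplified sketch, not the actual script; the file name and bucket path are assumptions:

$ pg_dumpall -U postgres > /tmp/postgres-backup-$(date +%s)
$ gzip /tmp/postgres-backup-*
$ aws s3 cp /tmp/postgres-backup-*.gz s3://s3-bucket-name/backup/postgres/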

Here are the steps to create a backup of Postgres manually.

STEP 1: Create a Kubernetes job to perform the backup process

Here is the manifest file to create the backup job. Save this file in your local system (e.g. "postgresql-backup-job.yaml").

apiVersion: batch/v1
kind: Job
metadata:
  name: postgresql-backup-job
  namespace: postgresql
spec:
  backoffLimit: 2
  completions: 1
  parallelism: 1
  template:
    spec:
      containers:
        - args:
            - /script.sh
          command:
            - bash
          envFrom:
            - secretRef:
                name: postgres-backup-secret
          image: proxy.replicated.com/proxy/lite/ghcr.io/atlanhq/postgres-backup:1.0.4
          imagePullPolicy: IfNotPresent
          name: postgres-backup
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      imagePullSecrets:
        - name: kotsadm-replicated-registry
      restartPolicy: OnFailure
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

STEP 2: Apply this job in the Kubernetes cluster

$ kubectl apply -f ./postgresql-backup-job.yaml

STEP 3: Track the progress of the backup

A pod will be created in the postgresql namespace. You can track the backup's progress by tailing that pod's logs.

$ kubectl get pods -n postgresql
$ kubectl logs "backup-pod-name" -n postgresql -f
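
Alternatively, you can block until the job finishes; the timeout value below is just an example:

$ kubectl wait --for=condition=complete job/postgresql-backup-job -n postgresql --timeout=15m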

STEP 4: All done!

Once the pod has completed, your backup is complete.

Restoring data from Postgres

Follow the steps below to restore Postgres data in a new stack.

Prerequisites

Here's what you'll need to restore Postgres data:

  • An Atlan instance that is up and running.

  • IAM permission to access AWS S3 and AWS CloudFormation.

  • Kubectl access to both the existing cluster and new cluster where the restoration is being performed.

STEP 1: Download the Postgres backup from S3

  • Go to the "CloudFormation" dashboard, and search for the stack name of the desired instance (e.g. "infrapoc").

  • Click on the nested template with "S3DatalakeBucket" in the name (e.g. "infrapoc-S3DatalakeBucket-112233LLCC").

  • Go to the Resources section, and click on the physical ID of the S3 data lake bucket.

  • Navigate to the backup folder, then to "postgres".

  • Select the latest backup object, then go to "Actions" and choose "Download".

STEP 2: Retrieve the Postgres password from the old cluster

  • Execute the following commands in the old cluster:

    $ kubectl exec -it postgresql-postgresql-master-0 -n postgresql -- sh
    $ printenv
  • Make sure to copy the POSTGRES_PASSWORD and store it securely.
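
To grab just the password instead of scanning the full printenv output, you can filter it inside the pod:

    $ printenv | grep POSTGRES_PASSWORD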

STEP 3: Copy the backup to the new Postgres pod

  • Make sure your kubectl context points to the new cluster.

  • Navigate to the directory in your local machine where the backup has been downloaded (e.g. "Downloads").

  • Unzip the downloaded file.

    $ gzip -d postgres-backup-XXXXXXXXX.gz
  • Execute the following command:

    $ kubectl cp ~/Downloads/postgres-backup-XXXXXXXXX postgresql-postgresql-master-0:/tmp -n postgresql

STEP 4: Access Postgres in the current cluster and change the password

  • Execute the following commands in the new cluster:

    $ kubectl exec -it postgresql-postgresql-master-0 -n postgresql -- sh
    $ printenv
  • Copy the POSTGRES_PASSWORD for this new cluster, then connect to Postgres:

    $ psql -U postgres
  • When prompted, paste the new cluster's password you just copied.

  • Change the password to the one you copied earlier from the old cluster.

    ALTER USER postgres WITH PASSWORD 'old_password';

STEP 5: Delete the old databases

  • Deactivate the active sessions on the following databases by scaling down the applications that use them: Atlan, Auxiliary, Caspian, Keycloak, and Ranger.

    $ kubectl scale statefulset keycloak -n keycloak --replicas=0
    $ kubectl scale statefulset ranger -n ranger --replicas=0
    $ for apps in hermes auxiliary caspian user-service atlas; do kubectl scale deployment $apps -n $apps --replicas=0; done
  • Start dropping the databases:

    drop database atlan;
    drop database auxiliary;
    drop database caspian;
    drop database keycloak;
    drop database ranger;
  • If any session is still active for a particular database, execute the following:

    SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'DB_NAME';
  • Drop the remaining databases, if any.

STEP 6: Restore the backup

  • Create these new databases: Atlan, Auxiliary, Caspian, Keycloak, and Ranger.

    create database atlan;
    create database auxiliary;
    create database caspian;
    create database keycloak;
    create database ranger;
  • To migrate the backup, execute the following:

    $ psql -U postgres -d caspian -f /tmp/postgres-backup-XXXXXX
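
  • Optionally, verify the restore by listing the databases again (a quick sanity check, not part of the original procedure):

    $ psql -U postgres -c '\l'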

You have successfully restored the backup to the new cluster! 🎉

Kots

Reconfiguring Kots with a new value

Now that you've launched a new stack, it's time to configure it. Follow the steps below to reconfigure Kots.

  • Go to the release portal of the existing stack, and get the value for "Keycloak Client Secret".

  • Get Kubectl access to the new stack.

  • Run this command:

$ kubectl kots download --slug lite --decrypt-password-values true --namespace kots --dest ./dist
  • Open the file under ./dist/lite/upstream/userdata/config.yaml.

  • Replace the value for Keycloak Client Secret wherever it is used.

  • Replace the value of the new password with the password of the old stack, wherever it is used.

  • Run this command:

$ kubectl kots upload --namespace kots --slug lite ./dist/lite
  • Go to the release portal of the new stack, and deploy the recently created release.

Verifying the product

  • Check if all the pods are up:

    $ kubectl get pods -A
  • Verify the product by logging in.
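
If anything looks off, you can surface pods that are not in the Running phase (an optional sanity check):

    $ kubectl get pods -A --field-selector=status.phase!=Running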

If you face any issues in following these steps, you can always reach out to us at [email protected].