
Getting Started: Deploy locally

We ❤️‍🔥 Dev teams. data.all provides a local developer experience that lets you run the data.all backend on your own machine, so teams can implement, test and release new features quickly and easily.

Prerequisites

Depending on your OS the installation steps may differ, but you will need the following:

  • Python >= 3.7
  • Docker and docker-compose >= 2
  • Node.js
  • Yarn
  • AWS credentials as environment variables (see the sketch after this list).
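A minimal sketch to confirm the tooling is in place and to export AWS credentials as environment variables (all key values below are placeholders; use your own credentials):

# Check the required tooling versions
python3 --version         # expect >= 3.7
docker --version
docker-compose --version  # expect >= 2
node --version
yarn --version

# Export AWS credentials as environment variables (placeholder values)
export AWS_ACCESS_KEY_ID=XXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXX
export AWS_SESSION_TOKEN=XXXXXXX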

1. Clone data.all code and run docker compose

data.all is fully dockerized with docker-compose and can be run entirely from your local computer. The first step is to clone the repo.

git clone https://github.com/data-dot-all/dataall.git --branch v2.4.0

With docker-compose we orchestrate the build of 5 containers: frontend, db, graphql, cdkproxy and opensearch. You can check the ports assigned to each container in the docker-compose.yaml file at the root of the repo.

cd dataall
export UID && docker-compose up

Note: We export UID so that the Docker user created to run the containers has the correct permissions to read from the file systems mounted for the locally deployed data.all.
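Once the build finishes, you can verify that all five containers are up and see their assigned ports (service names come from the docker-compose.yaml mentioned above):

# List the running data.all services and their ports
docker-compose ps

# Optionally, follow the logs of a single service, e.g. the graphql backend
docker-compose logs -f graphql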


🎉 Congratulations 🎉 You can now access the UI at http://localhost:8080

2. Set AWS credentials

You can access the UI, but until you set AWS credentials you won’t be able to create Environments, Datasets, or any other data.all stack.

We have configured Docker to read the AWS credentials from the ‘default’ profile configured on your local machine, EC2 instance, Cloud9 environment, or wherever you are working.

You can use credentials for any AWS account, but we recommend using the AWS account that serves as the data.all infrastructure account. This is how your ~/.aws/credentials file should look:

[default]
aws_access_key_id=XXXXXXX
aws_secret_access_key=XXXXX
aws_session_token=XXXXXXX
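To confirm that the default profile resolves to the account you intend to use, you can call STS:

# Print the account and identity behind the default profile
aws sts get-caller-identity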

3. Create parameters in the credentials AWS account

data.all reads some parameters from SSM Parameter Store, so we need to create them in the selected AWS account for data.all to work fully.

Create the dkrcompose externalId parameter for the pivot role in the credentials account with the following command.

aws ssm put-parameter \
    --name "/dataall/dkrcompose/pivotRole/externalId" \
    --value "randomvalue1234" \
    --type String

And the dkrcompose pivot role name parameter.

aws ssm put-parameter \
    --name "/dataall/dkrcompose/pivotRole/pivotRoleName" \
    --value "dataallPivotRole" \
    --type String

Finally, set the value for the pivot role auto-creation flag (enablePivotRoleAutoCreate).

aws ssm put-parameter \
    --name "/dataall/dkrcompose/pivotRole/enablePivotRoleAutoCreate" \
    --value "False" \
    --type String
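To double-check that all three parameters exist, you can list everything under the pivot role path:

# List all parameters created under the pivot role path
aws ssm get-parameters-by-path \
    --path "/dataall/dkrcompose/pivotRole" \
    --recursive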

4. Linking environments

As explained in the architecture section, data.all communicates with the environment accounts through 2 mechanisms:

1) CDK bootstrap with --trust to the data.all central account: the AWS account chosen as the environment account needs to be bootstrapped trusting the AWS account that you used in steps 2 and 3.

2) An IAM pivot role trusting the data.all account used in steps 2 and 3: when you create the pivot role stack in the environment account, use the account specified in steps 2 and 3 and the externalId and pivot role name defined in step 3.
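As an illustrative sketch of the first mechanism (the account IDs and region below are placeholders; check the CDK documentation and your data.all version for the exact bootstrap options required):

# Bootstrap the environment account, trusting the data.all central account.
# 111111111111 = data.all account used in steps 2 and 3 (placeholder)
# 222222222222 = environment account to link (placeholder)
cdk bootstrap \
    --trust 111111111111 \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess \
    aws://222222222222/eu-west-1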

Going Further

To learn more about how to get started using the data.all UI and to get a holistic understanding of supported features, please check out the Getting Started Workshop and the data.all UserGuide (available as a PDF in the open-source repository or via the data.all UI).