Set up a new cluster

FaunaDB Hybrid clusters are made up of at least one replica with at least one node. A replica is a grouping of nodes with data partitioned across them. The nodes within a replica will ask one another for data before asking a node that might be further away.

Set up replicas across regions to maintain availability when a region goes down. Placing replicas close to your customer bases also lets clients query a local replica rather than cross the internet for data.

A FaunaDB Hybrid cluster can be used for entirely new data or as the target of a snapshot restore.

In this section we walk you through the steps of setting up a basic FaunaDB Hybrid cluster: three replicas with a single node each. If you are setting up your cluster in AWS EC2, read the "Set up a cluster in AWS EC2" section first.

Single nodes, for development or testing, are free to use indefinitely. Clusters involving 2 or more nodes are free to use for 90 days. For more information, see our Pricing page.

Dependencies

Before we get started, verify that the user running the FaunaDB service has read/write access to the config file (faunadb.yml), log path (log_path variable in faunadb.yml), and storage paths (storage_data_path variable in faunadb.yml).
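The access check described above could be scripted along these lines. This is a minimal sketch, not part of the FaunaDB tooling: check_rw is a hypothetical helper name, and the paths in the example comment are placeholders for the values in your own faunadb.yml. Run it as the user that will run the FaunaDB service.

```shell
# Sketch: report whether the current user can read and write each given path.
# check_rw is a hypothetical helper, not a FaunaDB command.
check_rw() {
  rc=0
  for p in "$@"; do
    if [ -r "$p" ] && [ -w "$p" ]; then
      echo "ok: $p"
    else
      echo "missing read/write access: $p"
      rc=1
    fi
  done
  # Non-zero exit status if any path failed the check.
  return "$rc"
}

# Example (placeholder paths; substitute your config file, log_path,
# and storage_data_path):
# check_rw /etc/faunadb.yml /var/log/faunadb /var/lib/faunadb
```

The function returns a non-zero status if any path is not readable and writable, so it can gate a startup script.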

You will need the following installed and set up:

You will also need curl (or something equivalent) for inspecting the public API.

FaunaDB Hybrid uses the following ports by default:

  • HTTP port 8443 for API requests (network_coordinator_http_port)

  • HTTP port 8444 for admin requests (network_admin_http_port)

  • Storage ports 7001/7501 for cross-replica traffic (network_peer_port/network_peer_secure_port)

Note that if you set stats_host, stats will be written to port 8125 unless changed with stats_port.
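Once nodes are running, you may want to confirm that the ports listed above are reachable between hosts. The following is a rough sketch that assumes a netcat (nc) build supporting the -z and -w options. check_ports is a hypothetical helper, and the address in the example comment is a placeholder.

```shell
# Sketch: check whether TCP ports on a host accept connections.
# Assumes an nc that supports -z (connect without sending data)
# and -w (timeout in seconds). check_ports is a hypothetical helper.
check_ports() {
  host=$1
  shift
  for port in "$@"; do
    if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
      echo "$host:$port open"
    else
      echo "$host:$port closed or unreachable"
    fi
  done
}

# Example, run from another node (10.0.0.1 is a placeholder address):
# check_ports 10.0.0.1 8443 7001
```

A "closed or unreachable" result can mean either that FaunaDB is not listening yet or that a firewall or security group is blocking the port.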

Set up a new cluster

To set up a new FaunaDB Hybrid cluster, follow these steps:

Determine network topology

Start by deciding on the basic network topology and replicas that should make up your cluster. For a FaunaDB Hybrid cluster, you should have a minimum of three replicas. Each replica contains a full set of the cluster’s data.

If you expect traffic to be spread evenly, give each replica the same number of nodes so that load and storage are balanced across all nodes. Otherwise, add nodes to high-traffic replicas as needed.

The rest of this section will assume three replicas named replica_1, replica_2, and replica_3.

Set up configuration file

Create a config file for each node and save it as /etc/faunadb.yml. You can see an example faunadb.yml in your FaunaDB Hybrid package.

At minimum, you will need to specify:

  • replica_name: This is the name used to group any additional nodes within a replica, for example: 'replica_1'.

  • network_broadcast_address: The node’s IP address.

  • network_listen_address: The interface address that FaunaDB binds to for incoming requests.

  • auth_root_key: The root admin key for the FaunaDB Query API.
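As an illustration, the four required settings could be written to a config file with a small shell helper like the one below. This is only a sketch: write_config is a hypothetical function, every value passed to it is a placeholder, and using the same address for both network_broadcast_address and network_listen_address is an assumption that may not suit your network. The example faunadb.yml in your package remains the authoritative reference for the format.

```shell
# Sketch: write a minimal faunadb.yml for one node.
# Only the key names come from the list above; all values are placeholders.
# write_config is a hypothetical helper, not a FaunaDB command.
write_config() {
  out=$1
  replica=$2
  ip=$3
  root_key=$4
  cat > "$out" <<EOF
replica_name: $replica
network_broadcast_address: $ip
network_listen_address: $ip
auth_root_key: "$root_key"
EOF
}

# Example (placeholder values):
# write_config /etc/faunadb.yml replica_1 10.0.0.1 my-root-key
```

Run it once per node, changing the replica name and address each time.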

If your environment requires encrypted peer-to-peer communication, configure peer-to-peer encryption before moving to the next step.

Start the first node

Continuing on the first node, navigate to the FaunaDB install directory and start FaunaDB by running:

faunadb -c »Config filename«

For example:

faunadb -c /etc/faunadb.yml

Then initialize your new FaunaDB Hybrid cluster:

faunadb-admin init

Start other nodes

Once the first node is up, start FaunaDB on the other two nodes by running:

faunadb -c »Config filename«

For example:

faunadb -c /etc/faunadb.yml

Then join each node to the first node so that the nodes get the data they need:

faunadb-admin join »IP of first node«

Start the replicas

On the first node, set the cluster replication to include all initial replicas:

faunadb-admin update-replica data+log »Replica 1 name« »Replica 2 name« »Replica 3 name«

For example, if we use our three replicas (replica_1, replica_2, and replica_3):

faunadb-admin update-replica data+log replica_1 replica_2 replica_3

Using replica type data+log ensures that these replicas, in addition to replicating data, also participate in the distributed transaction log. See the Managing Replicas section for more information.

Use the status admin command to track the progress of replicating data to the new nodes:

faunadb-admin status

Verify

Verify the cluster is up and running by using the ping endpoint:

curl http://localhost:8443/ping
{ "resource": "Scope write is OK" }

If you do not receive a 200 OK response, contact us to troubleshoot what went wrong.
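A node can take a moment to start answering requests, so it may help to poll the ping endpoint rather than check it once. The sketch below uses standard curl options (-s, -o, -w) to read only the HTTP status code. wait_for_ping is a hypothetical helper, and the retry count and delay are arbitrary defaults.

```shell
# Sketch: poll the ping endpoint until it returns HTTP 200 or attempts run out.
# wait_for_ping is a hypothetical helper; 10 attempts and a 3 second delay
# are arbitrary defaults.
wait_for_ping() {
  url=$1
  attempts=${2:-10}
  delay=${3:-3}
  i=1
  code=""
  while [ "$i" -le "$attempts" ]; do
    # -s hides progress output, -o discards the body,
    # -w prints only the HTTP status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || true
    if [ "$code" = "200" ]; then
      echo "ping OK after $i attempt(s)"
      return 0
    fi
    # Wait before retrying, but not after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "ping failed; last HTTP status: $code"
  return 1
}

# Example:
# wait_for_ping http://localhost:8443/ping
```

The function returns a non-zero status when every attempt fails, so it can be used as a readiness check in a provisioning script.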

Once your new FaunaDB Hybrid cluster has been set up, use your process manager of choice, rather than invoking faunadb directly, to run and manage your day-to-day operations.

Set up a cluster in AWS EC2

We recommend using EC2 instances with locally attached SSD storage. For production deployments, we recommend 8 cores and 30 GB RAM with M3 or C3, which corresponds to the largest instance size: m3.2xlarge or c3.2xlarge+.

AWS is organized as multiple regions that are operationally independent of each other. Communication between regions crosses the open internet, so use peer-to-peer encryption between regions.

FaunaDB Hybrid is sensitive to I/O performance. If your instance has multiple local storage devices, you can enable software RAID-0 for increased I/O performance. Refer to your OS’s documentation for instructions on setting up software RAID.

We recommend deploying to three regions. This allows the cluster to remain available to clients if any single region becomes unavailable.
