Configure rack awareness (AP)
This page describes how to implement rack awareness in namespaces configured for Available and Partition-tolerant (AP) mode.
Rack awareness stores replicas of records in separate hardware failure groups, which are defined by their `rack-id`.
How it works
The following examples illustrate how rack awareness operates.
- When you configure three racks with a replication factor of 3 (RF3), each rack receives one copy of a given partition (see the configuration sketch after this list).
- The three copies are on separate racks. The distribution across specific nodes is determined by the succession list.
- If you lose a rack, the number of replicas is eventually restored to match the value of your `replication-factor`. For instance:
  - Your cluster is configured for RF3.
  - You reduce the number of racks from 3 to 2.
  - One rack hosts the master.
  - The other rack hosts one replica.
  - The third replica moves to one of the two remaining racks.
- To avoid having data missing from the cluster, configure rack awareness to use multiple racks defined by `rack-id`.
- Starting with Database 7.2, the active rack feature dynamically designates a particular `rack-id` to hold all master partition copies.
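For example, a three-rack RF3 layout is expressed by giving the nodes in each rack the same `rack-id` in the namespace stanza. The following fragment is an illustrative sketch only; the namespace name is a placeholder, and nodes in the other two racks would use `rack-id 2` and `rack-id 3`.

```
# aerospike.conf fragment on a node in rack 1
namespace test {
    replication-factor 3
    rack-id 1
    <...>
}
```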
Partition distribution
Masters are always evenly distributed across the nodes in the cluster, regardless of the `prefer-uniform-balance` configuration, even when the number of racks is greater than the `replication-factor` (RF). In these cases, a given rack does not hold a copy of every partition.
Configure the node-id
Partition distribution is based on the cluster's `node-id` values. By default, the node ID is derived from the MAC address. Node IDs can also be assigned in the node configuration to make it easier to identify cluster machines in admin operations. In a running cluster, node IDs can be changed one node at a time, in a rolling fashion across the cluster.
Specify the node ID inside the `service` stanza of `aerospike.conf`, as shown in the following example.
```
service {
    <...>
    node-id a1
    <...>
}
```
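To verify the assigned node IDs in a running cluster, you can query the cluster with `asadm`; for example, the `info network` summary includes each node's ID (exact column layout may vary by asadm version).

```bash
asadm -e "info network"
```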
Imbalanced racks
Imbalanced racks are racks with different numbers of nodes. Master partitions and replicas are distributed to specific racks; however, if the RF is configured higher than the current number of racks, the extra replicas are distributed randomly.
Imbalanced clusters
If a single node goes down, the cluster is temporarily imbalanced until the node is restored. This imbalance does not cause service interruptions; the cluster continues uninterrupted. Once the node restarts, the cluster automatically rebalances. How much the general load on the nodes differs across racks depends on the nature of the workload.
Statically assign a rack in an AP namespace
You can configure rack awareness at the namespace level. To assign nodes to the same rack, specify the same `rack-id` on each of those nodes.
```
namespace {
    ...
    rack-id 1
    ...
}
```
Dynamically assign a rack in an AP namespace
You can implement rack awareness for an existing cluster dynamically. You must persist your changes in `aerospike.conf` to protect against them being rolled back by a restart; this is the best practice for any dynamic configuration change.
- On each node, use `manage config` to change the `rack-id`:

  ```bash
  asadm -e "enable; manage config namespace NAMESPACE-NAME param rack-id to 1 with HOST"
  ```

- Add the `rack-id` to the namespace stanza in `aerospike.conf` to ensure that the configuration persists through any restarts (see the stanza sketch after this list).

- Trigger a rebalance of the cluster to start migrations with the new `rack-id` configuration:

  ```bash
  asadm --enable -e "manage recluster"
  ```
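The persisted stanza for the second step mirrors the static configuration shown earlier, for example (the namespace name is a placeholder):

```
namespace NAMESPACE-NAME {
    ...
    rack-id 1
    ...
}
```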
Display rack information
Use the following command to display the rack information.
```
asadm -e "show racks"
~~~~~~~~~~~~~~~~Racks (2021-10-21 20:33:28 UTC)~~~~~~~~~~~~~~~~~
Namespace|Rack|                                             Nodes
         |  ID|
bar      |   2|BB9040016AE4202, BB9020016AE4202, BB9010016AE4202
test     |   1|BB9040016AE4202, BB9010016AE4202
Number of rows: 2
```
In this example, the `bar` namespace has `rack-id` 2, which includes nodes BB9040016AE4202, BB9020016AE4202, and BB9010016AE4202, and the `test` namespace has `rack-id` 1, which includes nodes BB9040016AE4202 and BB9010016AE4202.
Rack aware reads
Database clients can read from the servers in the closest rack or zone on a preferential basis. This can lower latency, increase stability, and significantly reduce traffic charges by limiting cross-availability-zone traffic.
This feature is available in Java, C, C#, Go and Python clients.
Set up rack aware reads
- Set up clusters in logical racks. See Statically assign a rack in an AP namespace.

- Set the `rackId` and `rackAware` flags in the `ClientPolicy` object. Use the rack ID configured on the nodes in the availability zone (AZ) where the application is running. The following example uses Java to demonstrate how to enable rack awareness. Commands are similar in other clients.

  ```java
  ClientPolicy clientPolicy = new ClientPolicy();
  clientPolicy.rackId = <RACK ID>;
  clientPolicy.rackAware = true;
  ```

- When the application is connected, set two additional parameters in the `Policy` associated with the reads to make them rack aware. A complete sketch follows this list.

  ```java
  Policy policy = new Policy();
  // ReadModeAP.ALL indicates that all replicas can be consulted.
  policy.readModeAP = ReadModeAP.ALL;
  // Replica.PREFER_RACK indicates that the record in the same rack should be accessed if possible.
  policy.replica = Replica.PREFER_RACK;
  ```
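A minimal end-to-end sketch that combines these policies with the Java client is shown below. The seed host, namespace, set, and key are placeholders, and the rack ID is assumed to be 1.

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.policy.ClientPolicy;
import com.aerospike.client.policy.Policy;
import com.aerospike.client.policy.ReadModeAP;
import com.aerospike.client.policy.Replica;

public class RackAwareReadExample {
    public static void main(String[] args) {
        // Connect with a rack-aware ClientPolicy (rack ID 1 is a placeholder).
        ClientPolicy clientPolicy = new ClientPolicy();
        clientPolicy.rackId = 1;
        clientPolicy.rackAware = true;

        try (AerospikeClient client = new AerospikeClient(clientPolicy, "127.0.0.1", 3000)) {
            // Read policy that prefers replicas on the client's rack.
            Policy policy = new Policy();
            policy.readModeAP = ReadModeAP.ALL;
            policy.replica = Replica.PREFER_RACK;

            // Placeholder namespace, set, and key.
            Record record = client.get(policy, new Key("test", "demo", "key1"));
            System.out.println(record);
        }
    }
}
```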
Designate an active rack
When an active rack is designated, statically or dynamically, a particular rack ID holds all master partition copies. For active rack to take effect, all nodes must agree on the same active rack, and the number of racks must be at most equal to the configured `replication-factor`.
- Setting the active rack configuration dynamically must be followed by a recluster.
- `active-rack 0` disables the feature. This means that you can't designate `rack-id` 0 as the active rack.
- Changing the rack ID on all nodes with `rack-id 0` to a new value that is distinct from any other rack does not cause any migrations.
Statically enable active rack in aerospike.conf
Edit `aerospike.conf`.

```
namespace ns-name {
    ...
    rack-id 1
    active-rack 2  # the active-rack value (2) may be the same as rack-id (1)
    ...
}
```
Dynamically enable active rack using asadm
- Designate rack-id 1 to be the active rack.

  ```
  Admin> enable
  Admin+> manage config namespace ns-name param active-rack to 1
  ~Set Namespace Param active-rack to 1~
              Node|Response
  172.22.22.1:3000|ok
  172.22.22.2:3000|ok
  172.22.22.3:3000|ok
  172.22.22.4:3000|ok
  172.22.22.5:3000|ok
  172.22.22.6:3000|ok
  Number of rows: 6
  ```

- Issue the recluster command to apply the active rack configuration.

  ```
  Admin+> manage recluster
  Successfully started recluster
  ```

- Verify that migrations are complete.

  ```
  Admin+> info namespace object
  ~~~~~~~~~~Namespace Object Information (2024-07-26 17:10:20 UTC)~~~~~~~~~~
  Namespace|            Node|Rack|  Repl|Expirations|  Total|~~~~~~~~~~Objects~~~~~~~~~~|~~~~~~~~~Tombstones~~~~~~~~|~~~~Pending~~~~
           |                |  ID|Factor|           |Records| Master| Prole|Non-Replica| Master| Prole|Non-Replica|~~~~Migrates~~~
           |                |    |      |           |       |       |      |           |       |      |           |    Tx|    Rx
  ap       |172.22.22.1:3000|   1|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.2:3000|   1|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.3:3000|   1|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.4:3000|   0|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.5:3000|   0|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.6:3000|   0|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |                |    |      |      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  Number of rows: 6
  ```

- Verify that only one rack holds the master (or Primary) partitions.

  ```
  Admin+> show pmap
  ~~~~~~~~~~~~~Partition Map Analysis (2024-07-26 17:10:25 UTC)~~~~~~~~~~~~~
  Namespace|            Node| Cluster Key|~~~~~~~~~~~~Partitions~~~~~~~~~~~~
           |                |            |Primary|Secondary|Unavailable|Dead
  ap       |172.22.22.1:3000|25C9430BCEE5|   1365|        0|          0|   0
  ap       |172.22.22.2:3000|25C9430BCEE5|   1365|        0|          0|   0
  ap       |172.22.22.3:3000|25C9430BCEE5|   1366|        0|          0|   0
  ap       |172.22.22.4:3000|25C9430BCEE5|      0|     1365|          0|   0
  ap       |172.22.22.5:3000|25C9430BCEE5|      0|     1365|          0|   0
  ap       |172.22.22.6:3000|25C9430BCEE5|      0|     1366|          0|   0
  ap       |                |            |   4096|     4096|          0|   0
  Number of rows: 6
  ```

  Note that all master (or Primary) partitions are now on the nodes configured with `rack-id 1`, which we designated as the active rack.
Where to next?
- Configure the service, fabric, and info sub-stanzas, which define which interface to use for application-to-node communication.
- Configure the heartbeat sub-stanza, which defines which interface to use for intra-cluster communications.
- Learn more about Rack aware architecture.