Configure rack awareness (AP)
This page describes how to implement rack awareness in namespaces configured for Available and Partition-tolerant (AP) mode.
Rack awareness stores replicas of records in separate hardware failure groups, which are defined by their `rack-id`.
How it works
The following examples illustrate how rack awareness operates.
- When you configure three racks with a replication factor of 3 (RF3), each rack receives one copy of a given partition (see the configuration sketch after this list).
- The three copies are on separate racks. The distribution across specific nodes is determined by the succession list.
- If you lose a rack, the number of replicas is eventually restored to match the value of your `replication-factor`. For instance:
  - Your cluster is configured for RF3.
  - You reduce the number of racks from 3 to 2.
  - One rack hosts the master.
  - The other rack hosts one replica.
  - The third replica moves to one of the two remaining racks.
- To avoid having data missing from the cluster, configure rack awareness to use multiple racks defined by `rack-id`.
- Starting with Database 7.2, the active rack feature dynamically designates a particular `rack-id` to hold all master partition copies.
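For example, a three-rack RF3 layout is expressed by giving the nodes in each rack the same `rack-id` in the namespace stanza. The following fragment is an illustrative sketch only; the namespace name is a placeholder, and nodes in the other two racks would use `rack-id 2` and `rack-id 3`.

```
# aerospike.conf fragment on a node in rack 1
namespace test {
    replication-factor 3
    rack-id 1
    <...>
}
```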
Partition distribution
Masters are always evenly distributed across the nodes in the cluster, regardless of the `prefer-uniform-balance` configuration, even when the number of racks is greater than the `replication-factor` (RF). In these cases, a given rack does not hold a copy of every partition.
Configure the node-id
Partition distribution is based on the cluster's `node-id` values. By default, the node ID is derived from the MAC address. Node IDs can also be assigned in the node configuration to make it easier to identify cluster machines in admin operations. In a running cluster, node IDs can be changed one node at a time, in a rolling fashion across the cluster.
Specify the node ID inside the `service` stanza of `aerospike.conf`, as shown in the following example.
```
service {
    <...>
    node-id a1
    <...>
}
```
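To verify the assigned node IDs in a running cluster, you can query the cluster with `asadm`; for example, the `info network` summary includes each node's ID (exact column layout may vary by asadm version).

```bash
asadm -e "info network"
```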
Imbalanced racks
Imbalanced racks are racks with different numbers of nodes. Master partitions and replicas are distributed to specific racks; however, if the RF is configured higher than the current number of racks, the extra replicas are distributed randomly.
Imbalanced clusters
If a single node goes down, the cluster is temporarily imbalanced until the node is restored. This imbalance does not cause service interruptions; the cluster continues uninterrupted. Once the node restarts, the cluster automatically rebalances. How much the general load on the nodes differs across racks depends on the nature of the workload.
Statically assign a rack in an AP namespace
You can configure rack awareness at the namespace level. To assign nodes to the same rack, specify the same `rack-id` on each of those nodes.
```
namespace {
    ...
    rack-id 1
    ...
}
```
Dynamically assign a rack in an AP namespace
You can implement rack awareness for an existing cluster dynamically. You must persist your changes in `aerospike.conf` to protect against them being rolled back by a restart; this is the best practice for any dynamic configuration change.
- On each node, use `manage config` to change the `rack-id`:

  ```bash
  asadm -e "enable; manage config namespace NAMESPACE-NAME param rack-id to 1 with HOST"
  ```

- Add the `rack-id` to the namespace stanza in `aerospike.conf` to ensure that the configuration persists through any restarts (see the stanza sketch after this list).

- Trigger a rebalance of the cluster to start migrations with the new `rack-id` configuration:

  ```bash
  asadm --enable -e "manage recluster"
  ```
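The persisted stanza for the second step mirrors the static configuration shown earlier, for example (the namespace name is a placeholder):

```
namespace NAMESPACE-NAME {
    ...
    rack-id 1
    ...
}
```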
Display rack information
Use the following command to display the rack information.
```
asadm -e "show racks"
~~~~~~~~~~~~~~~~Racks (2021-10-21 20:33:28 UTC)~~~~~~~~~~~~~~~~~
Namespace|Rack|                                             Nodes
         |  ID|
bar      |   2|BB9040016AE4202, BB9020016AE4202, BB9010016AE4202
test     |   1|BB9040016AE4202, BB9010016AE4202
Number of rows: 2
```
In this example, the `bar` namespace has `rack-id` 2, which includes nodes BB9040016AE4202, BB9020016AE4202, and BB9010016AE4202, and the `test` namespace has `rack-id` 1, which includes nodes BB9040016AE4202 and BB9010016AE4202.
Rack aware reads
Database clients can read from the servers in the closest rack or zone on a preferential basis. This can lower latency, increase stability, and significantly reduce traffic charges by limiting cross-availability-zone traffic.
This feature is available in Java, C, C#, Go and Python clients.
Set up rack aware reads
- Set up clusters in logical racks. See Statically assign a rack in an AP namespace.

- Set the `rackId` and `rackAware` flags in the `ClientPolicy` object. Use the rack ID configured on the nodes in the availability zone (AZ) where the application is running. The following example uses Java to demonstrate how to enable rack awareness. Commands are similar in other clients.

  ```java
  ClientPolicy clientPolicy = new ClientPolicy();
  clientPolicy.rackId = <RACK ID>;
  clientPolicy.rackAware = true;
  ```

- When the application is connected, set two additional parameters in the `Policy` associated with the reads to make them rack aware. A complete sketch follows this list.

  ```java
  Policy policy = new Policy();
  // ReadModeAP.ALL indicates that all replicas can be consulted.
  policy.readModeAP = ReadModeAP.ALL;
  // Replica.PREFER_RACK indicates that the record in the same rack should be accessed if possible.
  policy.replica = Replica.PREFER_RACK;
  ```
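A minimal end-to-end sketch that combines these policies with the Java client is shown below. The seed host, namespace, set, and key are placeholders, and the rack ID is assumed to be 1.

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.policy.ClientPolicy;
import com.aerospike.client.policy.Policy;
import com.aerospike.client.policy.ReadModeAP;
import com.aerospike.client.policy.Replica;

public class RackAwareReadExample {
    public static void main(String[] args) {
        // Connect with a rack-aware ClientPolicy (rack ID 1 is a placeholder).
        ClientPolicy clientPolicy = new ClientPolicy();
        clientPolicy.rackId = 1;
        clientPolicy.rackAware = true;

        try (AerospikeClient client = new AerospikeClient(clientPolicy, "127.0.0.1", 3000)) {
            // Read policy that prefers replicas on the client's rack.
            Policy policy = new Policy();
            policy.readModeAP = ReadModeAP.ALL;
            policy.replica = Replica.PREFER_RACK;

            // Placeholder namespace, set, and key.
            Record record = client.get(policy, new Key("test", "demo", "key1"));
            System.out.println(record);
        }
    }
}
```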
Designate an active rack
When an active rack is designated, statically or dynamically, a particular rack ID holds all master partition copies. For active rack to take effect, all nodes must agree on the same active rack, and the number of racks must be at most equal to the configured `replication-factor`.
- Setting the active rack configuration dynamically must be followed by a recluster.
- `active-rack 0` disables the feature. This means that you can't designate `rack-id` 0 as the active rack.
- Changing the rack ID on all nodes with `rack-id 0` to a new value that is distinct from any other rack does not cause any migrations.
Statically enable active rack in aerospike.conf
Edit `aerospike.conf`.

```
namespace ns-name {
    ...
    rack-id 1
    active-rack 2  # the active-rack value (2) may be the same as rack-id (1)
    ...
}
```
Dynamically enable active rack using asadm
- Designate rack-id 1 to be the active rack.

  ```
  Admin> enable
  Admin+> manage config namespace ns-name param active-rack to 1
  ~Set Namespace Param active-rack to 1~
              Node|Response
  172.22.22.1:3000|ok
  172.22.22.2:3000|ok
  172.22.22.3:3000|ok
  172.22.22.4:3000|ok
  172.22.22.5:3000|ok
  172.22.22.6:3000|ok
  Number of rows: 6
  ```

- Issue the recluster command to apply the active rack configuration.

  ```
  Admin+> manage recluster
  Successfully started recluster
  ```

- Verify that migrations are complete.

  ```
  Admin+> info namespace object
  ~~~~~~~~~~Namespace Object Information (2024-07-26 17:10:20 UTC)~~~~~~~~~~
  Namespace|            Node|Rack|  Repl|Expirations|  Total|~~~~~~~~~~Objects~~~~~~~~~~|~~~~~~~~~Tombstones~~~~~~~~|~~~~Pending~~~~
           |                |  ID|Factor|           |Records| Master| Prole|Non-Replica| Master| Prole|Non-Replica|~~~~Migrates~~~
           |                |    |      |           |       |       |      |           |       |      |           |    Tx|    Rx
  ap       |172.22.22.1:3000|   1|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.2:3000|   1|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.3:3000|   1|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.4:3000|   0|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.5:3000|   0|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |172.22.22.6:3000|   0|     2|      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  ap       |                |    |      |      0.000|  0.000|  0.000| 0.000|      0.000|  0.000| 0.000|      0.000| 0.000| 0.000
  Number of rows: 6
  ```

- Verify that only one rack holds the master (or Primary) partitions.

  ```
  Admin+> show pmap
  ~~~~~~~~~~~~~Partition Map Analysis (2024-07-26 17:10:25 UTC)~~~~~~~~~~~~~
  Namespace|            Node| Cluster Key|~~~~~~~~~~~~Partitions~~~~~~~~~~~~
           |                |            |Primary|Secondary|Unavailable|Dead
  ap       |172.22.22.1:3000|25C9430BCEE5|   1365|        0|          0|   0
  ap       |172.22.22.2:3000|25C9430BCEE5|   1365|        0|          0|   0
  ap       |172.22.22.3:3000|25C9430BCEE5|   1366|        0|          0|   0
  ap       |172.22.22.4:3000|25C9430BCEE5|      0|     1365|          0|   0
  ap       |172.22.22.5:3000|25C9430BCEE5|      0|     1365|          0|   0
  ap       |172.22.22.6:3000|25C9430BCEE5|      0|     1366|          0|   0
  ap       |                |            |   4096|     4096|          0|   0
  Number of rows: 6
  ```

  Note that all master (or Primary) partitions are now on the nodes configured with `rack-id 1`, which we designated as the active rack.
Where to next?
- Configure the service, fabric, and info sub-stanzas, which define which interface to use for application-to-node communication.
- Configure the heartbeat sub-stanza, which defines which interface to use for intra-cluster communications.
- Learn more about Rack aware architecture.