This function answers a need I have had multiple times, where I want to replicate data from an Amazon Kinesis Stream to another account or another region for processing, or to another environment such as development.
This AWS Lambda function can be used to propagate incoming messages from Amazon Kinesis Streams or Amazon DynamoDB Streams to other services (Amazon SNS, Amazon SQS, Amazon Elasticsearch Service, Amazon Kinesis Streams, Amazon Kinesis Firehose, AWS IoT, AWS Lambda, Amazon ElastiCache for Memcached and Redis), regions or accounts. This function generates metrics that will be published to Amazon CloudWatch Metrics, in the Custom/FanOut namespace.
This function can be run in the 'public' AWS Lambda mode (which should be the default for most services), or inside an Amazon Virtual Private Cloud (VPC) if you use Amazon ElastiCache (Redis or Memcached).
As the configuration data resides in Amazon DynamoDB, and because the function sends metrics to Amazon CloudWatch, the function must have Internet access. To have Internet access, an AWS Lambda function running inside an Amazon VPC must reside in a private subnet with a route to the Internet through a NAT (either a self-managed NAT instance or an Amazon managed NAT Gateway). This limitation exists because AWS Lambda functions in an Amazon VPC don't have a public IP address.
Here is a list of terms used in this documentation:
- fanout function refers to the AWS Lambda function running the provided code;
- Source refers to an Amazon Kinesis Stream or Amazon DynamoDB Stream on which the fanout function has an event source mapping and that will generate events;
- Source Account refers to the AWS Account in which the fanout function is executing;
- Source Region refers to the region in which the fanout function is executing;
- Target refers to a destination (see target types for details) to which the fanout function will send records after processing;
- Target Account refers to the AWS Account in which a target resides;
- Target Region refers to the region in which a target resides.
The fanout function maps sources to targets.
Sources can currently be:
- An Amazon Kinesis Stream (specified by ARN)
- An Amazon DynamoDB Stream (specified by ARN)
Targets have a specific type defined, and a destination. Currently the allowed types and destination formats are the following:
- sns for specifying an Amazon Simple Notification Service (SNS) Topic
  - ARN of the target Amazon SNS Topic
- sqs for specifying an Amazon Simple Queue Service (SQS) Queue
  - URL of the target Amazon SQS Queue
- es for specifying an Amazon Elasticsearch Domain
  - Composite string containing the FQDN of the target Amazon Elasticsearch Service Domain endpoint, followed by # then the storage specification <doctype>/<index>
- kinesis for specifying an Amazon Kinesis Stream
  - name of the target Amazon Kinesis Stream
- firehose for specifying an Amazon Kinesis Firehose Delivery Stream
  - name of the target Amazon Kinesis Firehose Delivery Stream
- iot for specifying an AWS IoT MQTT topic
  - Composite string containing the FQDN of the target AWS IoT endpoint (specific per account / region), followed by # then the MQTT topic name
- lambda for specifying an AWS Lambda Function
  - name of the target AWS Lambda Function
- memcached for specifying an Amazon ElastiCache Memcached Cluster
  - FQDN of the target Amazon ElastiCache Memcached Cluster endpoint
- redis for specifying an Amazon ElastiCache Redis Replication Group
  - FQDN of the target Amazon ElastiCache Redis Replication Group primary endpoint
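The composite destination formats used by the es and iot types can be illustrated with a short sketch. The parseDestination helper below is hypothetical (it is not taken from the fanout code) and only shows how such a string splits at the # separator; the endpoint names are made up for the example.

```javascript
// Hypothetical helper, not part of the fanout function's source code.
// Splits a composite destination of the form "<endpoint>#<suffix>".
function parseDestination(type, destination) {
  if (type !== 'es' && type !== 'iot') {
    // The other types use a plain ARN, URL, name or FQDN.
    return { endpoint: destination };
  }
  const separator = destination.indexOf('#');
  if (separator < 0) {
    throw new Error(`Expected "<endpoint>#<suffix>" for type ${type}`);
  }
  const endpoint = destination.slice(0, separator);
  const suffix = destination.slice(separator + 1);
  if (type === 'iot') {
    // For iot, the suffix is the MQTT topic name.
    return { endpoint, topic: suffix };
  }
  // For es, the suffix is the storage specification "<doctype>/<index>".
  const [doctype, index] = suffix.split('/');
  return { endpoint, doctype, index };
}

// Made-up endpoints, for illustration only.
console.log(parseDestination('es', 'search-example.us-east-1.es.amazonaws.com#logs/events'));
console.log(parseDestination('iot', 'example.iot.us-east-1.amazonaws.com#sensors/room1'));
```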
Each target is defined by a set of parameters, stored in an Amazon DynamoDB table. The default name of the table is derived from the function name: <function-name>Targets. Here are the properties for this table:
- sourceArn (String) [required]: the ARN of the event source (Amazon Kinesis Stream or Amazon DynamoDB Stream) (Table Hash Key)
- id (String) [required]: the identifier of the fan-out target (Table Range Key)
- type (String) [required]: the type of destination for the fan-out target
- destination (String) [required]: the destination of the messages (as defined in the Targets section)
- active (Boolean) [required]: indicates if this target is active or not
- role (String) [optional]: for cross-account roles: you can specify the role ARN that will be assumed
- externalId (String) [optional]: for cross-account roles: you can specify an external Id for the STS:AssumeRole call
- region (String) [optional]: for cross-region calls, you can specify the region name
- collapse (String) [optional]: for AWS IoT, Amazon SQS and Amazon SNS, defines how the messages should be collapsed, if at all (none, JSON, concat, concat-b64; default: JSON)
- parallel (Boolean) [optional]: indicates if we should process sending these messages in parallel. Warning: this may break in-shard ordering for Amazon Kinesis (default true)
- convertDDB (Boolean) [optional]: for Amazon DynamoDB Streams messages, converts the DDB objects to plain Javascript objects
- deaggregate (Boolean) [optional]: for Amazon Kinesis Streams messages, deserializes KPL (protobuf-based) messages
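As an illustration of these properties, here is a minimal sketch of a target entry seen as a plain JavaScript object, with the documented defaults applied. The normalizeTarget helper and the sample ARNs are assumptions made for this example; the function's real loading code may differ.

```javascript
// Hypothetical sketch, not the function's actual configuration loader.
// Default values are the ones stated in this documentation.
const DEFAULTS = { collapse: 'JSON', parallel: true, convertDDB: false, deaggregate: false };
const REQUIRED = ['sourceArn', 'id', 'type', 'destination', 'active'];

function normalizeTarget(item) {
  const missing = REQUIRED.filter((name) => item[name] === undefined);
  if (missing.length > 0) {
    throw new Error(`Missing required properties: ${missing.join(', ')}`);
  }
  // Properties present in the item override the defaults.
  return { ...DEFAULTS, ...item };
}

// Placeholder ARNs, for illustration only.
const exampleTarget = normalizeTarget({
  sourceArn: 'arn:aws:kinesis:us-east-1:123456789012:stream/inputStream',
  id: 'target1',
  type: 'sns',
  destination: 'arn:aws:sns:us-east-1:123456789012:outputTopic',
  active: true,
});
console.log(exampleTarget);
```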
For sns, sqs, es, kinesis, firehose, iot and lambda:
- you can specify a target account and a target region for your targets;
- you can either run the fanout function as a 'public' AWS Lambda function or in an Amazon VPC with a NAT Gateway.
For memcached and redis:
- you will need to run the fanout function as an AWS Lambda function inside an Amazon VPC;
- the targets can only reside in the same Amazon VPC as your fanout function, and therefore in the source region and source account.
You can send the records to a target account. This feature leverages the STS:AssumeRole API to allow cross-account access.
To activate this feature, you have to specify the role property in the configuration. This property will contain the ARN of the AWS IAM Role from the target account to be used when publishing the data.
If you send data to an account that you don't own, you should specify the externalId property that is used to further limit the access to sts:AssumeRole.
For this to work, the AWS IAM Role used by the fanout function must have a policy that allows the sts:AssumeRole action on the role in your target account. The trust relationship of the AWS IAM Role in your target account must also allow sts:AssumeRole calls from the source account.
This feature is not available for redis and memcached.
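The two permissions described above can be sketched as IAM policy documents, written here as JavaScript objects. The account IDs, role name and external ID are placeholders chosen for this example, and trusting the whole source account (the :root principal) is only one possible choice; you may prefer to trust the specific execution role.

```javascript
// Placeholders for illustration only.
const sourceAccountId = '111111111111';
const targetRoleArn = 'arn:aws:iam::222222222222:role/targetRole';
const externalId = '123456';

// Policy attached to the execution role of the fanout function (source account).
const assumeRolePolicy = {
  Version: '2012-10-17',
  Statement: [
    { Effect: 'Allow', Action: 'sts:AssumeRole', Resource: targetRoleArn },
  ],
};

// Trust relationship of the role in the target account.
// The Condition block is only needed when an external Id is used.
const trustPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: { AWS: `arn:aws:iam::${sourceAccountId}:root` },
      Action: 'sts:AssumeRole',
      Condition: { StringEquals: { 'sts:ExternalId': externalId } },
    },
  ],
};

console.log(JSON.stringify(assumeRolePolicy, null, 2));
console.log(JSON.stringify(trustPolicy, null, 2));
```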
You can also send records to a target region, by specifying the region property in the configuration. This property will contain the name of the target region.
This feature is not available for redis and memcached.
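The restrictions on redis and memcached targets can be expressed as a small check. This is a hypothetical helper written for this documentation, not something the fanout CLI or function is known to provide.

```javascript
// Hypothetical validation helper, for illustration only.
// Target types that must live in the same VPC, region and account as the function.
const VPC_ONLY_TYPES = ['memcached', 'redis'];

function checkTargetScope(target) {
  const errors = [];
  if (VPC_ONLY_TYPES.includes(target.type)) {
    if (target.role) {
      errors.push(`cross-account role is not available for ${target.type} targets`);
    }
    if (target.region) {
      errors.push(`cross-region delivery is not available for ${target.type} targets`);
    }
  }
  return errors;
}

console.log(checkTargetScope({ type: 'redis', region: 'us-west-2' }));
console.log(checkTargetScope({ type: 'sns', region: 'us-west-2' }));
```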
To simplify the deployment, configuration and management of the function, a Bash Command Line Interface (CLI) is available. The command is ./fanout and is an automation of the AWS CLI.
Available commands are:
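- ./fanout deploy: deploys the fanout function as a new AWS Lambda function, or updates the currently existing AWS Lambda function
- ./fanout register <type>: creates a new mapping (source -> target) for the specified fanout function
- ./fanout list: lists existing targets for the specified fanout function and source
- ./fanout update: updates a mapping for the specified fanout function and source
- ./fanout activate: activates a mapping for the specified fanout function
- ./fanout deactivate: deactivates a mapping for the specified fanout function
- ./fanout unregister: deletes a mapping for the specified fanout function
- ./fanout destroy: destroys the fanout function
- ./fanout hook: registers an event source mapping for the specified source
- ./fanout unhook: unregisters the event source mapping for the specified source
- ./fanout pause: disables the event source mapping from the source to the fanout function
- ./fanout resume: enables the event source mapping from the source to the fanout function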
Some parameters (--exec-role, --table, --source) have both a short form and a long form. The short form calls the AWS CLI to search for the element and, depending on the command, either creates it or raises an error if it does not exist. The long form (the same parameters with an -arn suffix: --exec-role-arn, --table-arn, --source-arn) accepts the parameter as-is. Using the long form speeds up execution and allows scenarios where the user running the command does not have enough rights to discover the elements, while the function does.
Optional command line parameters are:
- --region <region>, --profile <profile>, --debug <boolean>, --endpoint-url <url>, --no-verify-ssl: passed directly to the AWS CLI calls
The deploy command creates the fanout function and supporting resources (AWS IAM Role, Amazon DynamoDB Table). It expects the following parameters:
- --function <fanout> (optional, defaults to fanout) all commands expect an AWS Lambda function name to be provided.
  - This name will be used to derive the AWS IAM Role name (<function-name>Role) if it is not provided.
  - This name will be used to derive the Amazon DynamoDB Table name (<function-name>Targets) if it is not provided.
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration. The table is created if it does not exist.
  - You can specify the table by ARN and avoid table detection and creation by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
- --subnet <subnet-01234567> (optional) specify the subnet to use for executing the AWS Lambda function in VPC mode
  - this parameter can be repeated to specify multiple subnets for the function, recommended for multi-AZ scenarios
  - this parameter can accept a comma-separated list of subnets: --subnet <subnet-01234567>,<subnet-89abcdef>
  - this parameter is required at least once if the --security-group parameter is used
- --security-group <sg-01234567> (zero or multiple times) specify the security groups to use for executing the AWS Lambda function in VPC mode
  - this parameter can be repeated to specify multiple security groups for the function
  - this parameter can accept a comma-separated list of security groups: --security-group <sg-01234567>,<sg-89abcdef>
  - this parameter is required at least once if the --subnet parameter is used
- --exec-role <fanoutRole> (optional) specify the AWS IAM Role to use for the fanout function. The role is created if it does not exist.
  - You can specify the role by ARN and avoid role detection and creation by using --exec-role-arn <arn:aws:iam::0123456789abcdef:role/fanoutRole>
- --memory <128> (optional, default 128) specify the amount of memory (in MiB) to use for the function
- --runtime <nodejs14.x> (optional, default nodejs14.x) specify the runtime environment for the Lambda function
- --timeout <30> (optional, default 30) specify the timeout of the function (in seconds)
Example:
./fanout deploy --function fanout
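The rules for --subnet and --security-group (repeatable, comma-separated, and required together) can be sketched as follows. This is a hypothetical JavaScript illustration of the rules only; the Bash CLI implements its own parsing, which may differ. The resulting object uses the SubnetIds and SecurityGroupIds names of the AWS Lambda VPC configuration.

```javascript
// Hypothetical sketch of the argument rules, not the CLI's own code.
function collectValues(occurrences) {
  // Each occurrence may hold one ID or a comma-separated list of IDs.
  return occurrences
    .flatMap((value) => value.split(','))
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}

function buildVpcConfig(subnetArgs, securityGroupArgs) {
  const SubnetIds = collectValues(subnetArgs);
  const SecurityGroupIds = collectValues(securityGroupArgs);
  // Either both lists are provided (VPC mode) or neither ('public' mode).
  if ((SubnetIds.length === 0) !== (SecurityGroupIds.length === 0)) {
    throw new Error('--subnet and --security-group must be used together');
  }
  return SubnetIds.length === 0 ? null : { SubnetIds, SecurityGroupIds };
}

// Equivalent of: --subnet subnet-01234567,subnet-89abcdef --security-group sg-01234567
console.log(buildVpcConfig(['subnet-01234567,subnet-89abcdef'], ['sg-01234567']));
```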
The list command retrieves all the existing mappings for a specific source. It expects the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout list --function fanout --source-type kinesis --source inputStream
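Because sourceArn is the hash key of the configuration table, listing the mappings of one source can be expressed as a single Amazon DynamoDB Query. The sketch below only builds the request parameters in the low-level DynamoDB format and does not call AWS; whether the CLI issues exactly this request is an assumption, and the table name and ARN are placeholders.

```javascript
// Hypothetical sketch: builds Query parameters without calling AWS.
function buildListQuery(tableName, sourceArn) {
  return {
    TableName: tableName,
    // Match every item whose hash key equals the given source ARN.
    KeyConditionExpression: 'sourceArn = :sourceArn',
    ExpressionAttributeValues: { ':sourceArn': { S: sourceArn } },
  };
}

// Placeholder values, for illustration only.
console.log(buildListQuery(
  'fanoutTargets',
  'arn:aws:kinesis:us-east-1:123456789012:stream/inputStream'
));
```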
The register <type> command creates a new mapping for an existing fanout function. As the fanout function caches the configuration for performance reasons (default time of 1 minute), there may be a delay in the activation of this mapping. Note also that unless --active true is specified new mappings are created inactive for safety reasons. It expects the following parameters:
- <type> (required) the type of the destination
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --id <mapping-id> (required) specify the identifier of this mapping
- --destination <name> (optional) where name depends on the provider type:
  - ./fanout register sns:
    - name of the target topic, the CLI will search for the topic in the specified region
    - ARN of the target topic, detection will be deactivated
  - ./fanout register sqs:
    - name of the target queue, the CLI will search for the queue in the specified region
    - ARN of the target queue, detection will be deactivated
  - ./fanout register es:
    - name of the target domain, the CLI will search for the endpoint in the specified region, you will need to specify the --index parameter
    - Composite string containing the FQDN of the target Amazon Elasticsearch Service domain endpoint, followed by '#' then the storage specification 'doctype/index'
  - ./fanout register kinesis:
    - name of the target stream, the CLI will search for the stream in the specified region
    - ARN of the target stream, detection will be deactivated
  - ./fanout register firehose:
    - name of the target stream, the CLI will search for the stream in the specified region
    - ARN of the target stream, detection will be deactivated
  - ./fanout register iot:
    - name of the target topic
    - Composite string containing the FQDN of the target AWS IoT endpoint (specific per account / region), followed by '#' then the MQTT Topic Name
  - ./fanout register lambda:
    - name of the target function, the CLI will search for the function in the specified region
    - ARN of the target function, detection will be deactivated
  - ./fanout register memcached:
    - name of the target cluster, the CLI will search for the cluster in the specified region
    - FQDN of the target cluster endpoint, detection will be deactivated
  - ./fanout register redis:
    - name of the target cluster, the CLI will search for the cluster in the specified region
    - FQDN of the target primary endpoint, detection will be deactivated
- --index <doctype/index> (optional, required only for es with domain name) specify, for Amazon Elasticsearch Service, the index in the domain where the data will reside
- --destination-region <us-west-2> (optional) specify the target region for this mapping
- --active <true|false> (optional, default false) indicates if this target is active or not
- --destination-role-arn <arn:aws:iam::0123456789abcdef:role/targetRole> (optional) specify, for cross-account roles, the role ARN that will be assumed
- --external-id <123456> (optional) specify, for cross-account roles, an external Id for the STS:AssumeRole call
- --collapse <none|JSON|concat|concat-b64> (optional, default JSON) for AWS IoT, Amazon SQS and Amazon SNS, defines how the messages should be collapsed, if at all
- --parallel <true|false> (optional, default true) indicates if we should process sending these messages in parallel
- --convert-ddb <true|false> (optional, default false) for Amazon DynamoDB Streams messages, converts the DDB objects to plain Javascript objects
- --deaggregate <true|false> (optional, default false) for Amazon Kinesis Streams messages, deserializes KPL (protobuf-based) messages
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout register lambda --function fanout --source-type kinesis --source inputStream --id target1 --destination targetFunction
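To show what the --convert-ddb option means in practice, here is a minimal sketch of converting Amazon DynamoDB attribute values (as found in stream records) to plain JavaScript values. It is an illustration written for this documentation, not the function's actual converter: binary types (B, BS) are left out, and numbers are converted with Number, which loses precision for very large values.

```javascript
// Minimal sketch, not the fanout function's actual converter.
// Converts one DynamoDB attribute value, e.g. { S: 'text' }, to a plain value.
function fromAttributeValue(value) {
  if ('S' in value) return value.S;
  if ('N' in value) return Number(value.N);
  if ('BOOL' in value) return value.BOOL;
  if ('NULL' in value) return null;
  if ('L' in value) return value.L.map(fromAttributeValue);
  if ('M' in value) return fromItem(value.M);
  if ('SS' in value) return value.SS.slice();
  if ('NS' in value) return value.NS.map(Number);
  throw new Error(`Unsupported attribute type: ${Object.keys(value).join(', ')}`);
}

// Converts a whole item (a map of attribute name to attribute value).
function fromItem(item) {
  const result = {};
  for (const [name, value] of Object.entries(item)) {
    result[name] = fromAttributeValue(value);
  }
  return result;
}

// Made-up record image, for illustration only.
const newImage = {
  id: { S: 'order-1' },
  total: { N: '42.5' },
  tags: { L: [{ S: 'a' }, { S: 'b' }] },
  paid: { BOOL: true },
};
console.log(fromItem(newImage));
```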
The update command allows you to modify some parameters of your mappings. As the fanout function caches the configuration for performance reasons (default time of 1 minute), there may be a delay in the application of the modification. It expects the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --id <mapping-id> (required) specify the identifier of this mapping
- --active <true|false> (optional) indicates if this target is active or not
- --destination-role-arn <arn:aws:iam::0123456789abcdef:role/targetRole> (optional) specify, for cross-account roles, the role ARN that will be assumed
- --external-id <123456> (optional) specify, for cross-account roles, an external Id for the STS:AssumeRole call
- --collapse <none|JSON|concat|concat-b64> (optional, default JSON) for AWS IoT, Amazon SQS and Amazon SNS, defines how the messages should be collapsed, if at all
- --parallel <true|false> (optional) indicates if we should process sending these messages in parallel
- --convert-ddb <true|false> (optional, default false) for Amazon DynamoDB Streams messages, converts the DDB objects to plain Javascript objects
- --deaggregate <true|false> (optional, default false) for Amazon Kinesis Streams messages, deserializes KPL (protobuf-based) messages
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout update --function fanout --source-type kinesis --source inputStream --id target1 --parallel false
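The effect of --parallel on ordering can be illustrated with a generic sketch. Here send is a stand-in for any function that delivers one record to a target and returns a Promise; both sendAll and send are hypothetical names, and the function's real delivery code may be organized differently.

```javascript
// Hypothetical sketch of parallel versus in-order delivery.
async function sendAll(records, send, parallel) {
  if (parallel) {
    // All sends start at once; the order in which they complete is not guaranteed.
    return Promise.all(records.map((record) => send(record)));
  }
  // Each send starts only after the previous one has completed,
  // which preserves the order of the incoming records.
  const results = [];
  for (const record of records) {
    results.push(await send(record));
  }
  return results;
}

// Stand-in sender: the first record is deliberately slower than the others.
const arrivals = [];
const slowFirst = (record) => new Promise((resolve) => {
  setTimeout(() => { arrivals.push(record); resolve(record); }, record === 'r1' ? 20 : 1);
});

sendAll(['r1', 'r2'], slowFirst, true).then(() => console.log('arrival order:', arrivals));
```

With parallel set to false, the slower first record would still arrive before the second one, at the cost of a longer total delivery time.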
The activate and deactivate commands turn on or off a specific mapping. As the fanout function caches the configuration for performance reasons (default time of 1 minute), there may be a delay in the application of the modification. They expect the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --id <mapping-id> (required) specify the identifier of this mapping
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout activate --function fanout --source-type kinesis --source inputStream --id target1
./fanout deactivate --function fanout --source-type kinesis --source inputStream --id target1
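The delay mentioned above comes from the configuration cache. A time-based cache of this kind can be sketched as follows; createCache is a hypothetical helper, and the only value taken from this documentation is the default duration of 1 minute. The function's actual caching code may differ.

```javascript
// Hypothetical sketch of a time-based cache.
// `loader` stands for whatever reads the configuration (for example from DynamoDB).
// `now` can be replaced by a fake clock for testing.
function createCache(loader, ttlMs = 60 * 1000, now = Date.now) {
  let value;
  let loadedAt = -Infinity; // forces a load on the first call
  return function get() {
    if (now() - loadedAt >= ttlMs) {
      value = loader();
      loadedAt = now();
    }
    return value;
  };
}

// Until the cached entry expires, a change made with the CLI is not visible.
let loads = 0;
const getTargets = createCache(() => { loads += 1; return ['target1']; });
getTargets();
getTargets();
console.log('configuration loaded', loads, 'time(s)');
```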
The unregister command removes an existing mapping from the configuration table. As the fanout function caches the configuration for performance reasons (default time of 1 minute), there may be a delay before the target is effectively removed. It expects the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --id <mapping-id> (required) specify the identifier of this mapping
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout unregister --function fanout --source-type kinesis --source inputStream --id target1
The destroy command removes the fanout function and its configuration. It expects the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
Example:
./fanout destroy --function fanout
The hook and unhook commands register and unregister an event source mapping for the specified source. They expect the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout hook --function fanout --source-type kinesis --source inputStream
./fanout unhook --function fanout --source-type kinesis --source inputStream
The pause and resume commands turn a specific source off or on. They respectively disable and enable the event source mapping from the source to the fanout function. They expect the following parameters:
- --function <fanout> (optional, defaults to fanout) specify the name of the function
  - You can specify the function by ARN and avoid detection by using --function-arn <arn:aws:lambda:us-east-1:0123456789abcdef:function:fanout>
- --source-type <kinesis|dynamodb> (required) specify the type of the source (one of Amazon Kinesis Stream or Amazon DynamoDB Stream)
- --source <kinesisStream> (required) specify the name of the input Amazon Kinesis Stream or Amazon DynamoDB Stream
  - You can specify the source by ARN and avoid detection by using --source-arn <arn:aws:kinesis:us-east-1:0123456789abcdef:stream/inputStream>
  - When --source is used, you must specify --source-type as well
- --table <fanoutTargets> (optional) specify the table name to use for the function configuration.
  - If not specified, a default value of <function-name>Targets will be used.
  - You can specify the table by ARN and avoid table detection by using --table-arn <arn:aws:dynamodb:us-east-1:0123456789abcdef:table/fanoutTargets>
Example:
./fanout pause --function fanout --source-type kinesis --source inputStream
./fanout resume --function fanout --source-type kinesis --source inputStream
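In AWS Lambda terms, disabling or enabling an event source mapping corresponds to an UpdateEventSourceMapping request with the Enabled flag set accordingly. The sketch below only builds such request parameters and does not call AWS; buildToggleRequest is a hypothetical helper, the UUID is a placeholder, and how the CLI finds the mapping and issues the call is not shown here.

```javascript
// Hypothetical sketch: builds UpdateEventSourceMapping parameters without calling AWS.
function buildToggleRequest(mappingUuid, command) {
  if (command !== 'pause' && command !== 'resume') {
    throw new Error(`Unknown command: ${command}`);
  }
  // pause disables the mapping, resume enables it again.
  return { UUID: mappingUuid, Enabled: command === 'resume' };
}

// Placeholder identifier, for illustration only.
console.log(buildToggleRequest('00000000-0000-0000-0000-000000000000', 'pause'));
console.log(buildToggleRequest('00000000-0000-0000-0000-000000000000', 'resume'));
```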
Copyright 2011-2013 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
http://aws.amazon.com/apache2.0/
or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
