-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the InfluxDB Timestream connector #201
Draft
trevorbonas
wants to merge
11
commits into
awslabs:mainline
Choose a base branch
from
Bit-Quill:influxdb-timestream-connector
base: mainline
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Add the InfluxDB Timestream connector #201
trevorbonas
wants to merge
11
commits into
awslabs:mainline
from
Bit-Quill:influxdb-timestream-connector
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: forestmvey <[email protected]>
*Issue #, if available:* N/A. *Description of changes:* - A pre-commit hook has been added that uses `aws-secrets` in order to prevent secrets from being committed. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
*Issue #, if available:* N/A *Description of changes:* - SAM template `template.yml` added. - Asynchronous and synchronous invocation supported. - Documentation for how the connector can be deployed using the SAM template added. - `lambda_runtime` crate added. - `LambdaEvent<serde_json::Value>` used instead of `lambda_http::Request`, in order to support more types of requests, instead of just AWS service integrations. - Tests added for handling different kinds of `queryParameters` keys in requests. - DLQ implemented for asynchronous invocation. - Integration tests changed to check the newly returned `serde_json::Value` struct. - [x] Unit tests passed. - [x] Integration tests passed. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
- Users can now define partition keys for their newly-created tables. Partition key configuration is controlled using three environment variables, `custom_partition_key_type`, `custom_partition_key_dimension`, and `enforce_custom_partition_key`. - Environment variables added for CDPK support. - SAM template parameters added for CDPK support. - Documentation added for CDPK support. - Integration tests added for CDPK support. - Integration test asserts moved to end of tests, in order to ensure resources are cleaned up. - [x] Unit tests passed. - [x] Integration tests passed.
*Issue #, if available:* N/A. *Description of changes:* - Deployment permissions have been updated according to the required permissions as discovered by testing deploying the connector using its SAM template. - `samconfig.toml` added with default stack deployment options. - "Troubleshooting" section added to the README with two known errors users may encounter. - Issue with cross-platform Rust compilation. - ConflictExceptions occurring with concurrent instances of the connector. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
- The environment variable `local_invocation` has been added and is set to `true` by default when the Lambda function is run with `cargo lambda watch`. This environment enables responses to be returned in a format `cargo lambda` expects, in the 2.0 formatting. Otherwise, the 1.0 format will be returned, which the synchronous API Gateway expects. - `cargo fmt` run. - [x] Tested locally. - [x] Tested with synchronous invocation. - [x] Tested with asynchronous invocation.
*Issue #, if available:* N/A. *Description of changes:* - Ingestion to multiple tables done in parallel. - Ingestion of 100 records to a single table done in parallel. - Chunking of records into batches of 100 done in parallel. - Limit on maximum possible number of threads added. - Print statement for each 100 records removed. - Logging option added to SAM template. - All `println!` calls changed to `info!`. - Default logging level for the connector changed to `INFO`. - Trace statements that measure the execution time of each function added. - Instructions added to README for how to configure logging levels. - Database and tables are no longer checked if their corresponding environment variables for creation are not set. - The checking of whether tables exist has been moved within the asynchronous code block used to ingest records to a table. This checking is now done in parallel and the hashmap is looped through once, instead of twice. - Instructions added to README for how to reduce stack costs. - [x] Integration tests passed. - [x] Unit tests passed. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
* Initial implementation for adding single-table mapping. Signed-off-by: forestmvey <[email protected]> * Adding environment variables and updating documentation. Signed-off-by: forestmvey <[email protected]> * Adding tests and revising single-table mapping. Signed-off-by: forestmvey <[email protected]> * Adding single table example to README. Signed-off-by: forestmvey <[email protected]> * Fixing measure-name using line protocol metric name for single table mapping. Signed-off-by: forestmvey <[email protected]> * Fix typo in README. Signed-off-by: forestmvey <[email protected]> * Fixing documentation typos and test setting wrong environment variable. Signed-off-by: forestmvey <[email protected]> * Fixing table for single table multi measure records example in README. Signed-off-by: forestmvey <[email protected]> * Fix invalid line protocol examples with additional comma separating the timestamp. Signed-off-by: forestmvey <[email protected]> --------- Signed-off-by: forestmvey <[email protected]>
* Set default table mapping to multi-table for the InfluxDB Timestream Connector. Signed-off-by: forestmvey <[email protected]> * Fixing table formatting in README for InfluxDB Timestream Connector. Signed-off-by: forestmvey <[email protected]> * Updating README to reference multi-table for the default table mapping. Signed-off-by: forestmvey <[email protected]> --------- Signed-off-by: forestmvey <[email protected]>
* Add gzip support to Go client * Add gzip support to template * Ignore custom_partition_key_type if it is invalid option * Add comment about lack of local gzip support
…DB Timestream Connector (#33) * Defining least-privilege Lambda invocation permissions for the connector. Adding output for IAM policy when deploying using the SAM template. Signed-off-by: forestmvey <[email protected]> * Making stage name dynamic for output least privilege IAM policy. Signed-off-by: forestmvey <[email protected]> * Adding section in README for ingestion permissions of Lambda function. Signed-off-by: forestmvey <[email protected]> * Revising wording for IAM permissions in README. Signed-off-by: forestmvey <[email protected]> --------- Signed-off-by: forestmvey <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue #, if available:
N/A.
Description of changes:
The InfluxDB Timestream connector has been added.
The InfluxDB Timestream connector is a Rust application that receives, translates, and ingests line protocol data into Timestream for LiveAnalytics. The connector is intended to be deployed as part of a CloudFormation stack as a Lambda function, but can also be used as a library or run locally as a Lambda function.
The connector includes the following components:
A few things to note:
The connector splits up batches and ingests records in chunks of 100 at a time in parallel. This means that if a record in one of these chunks has an issue, an error will occur and ingestion will stop but other records in the request batch, records that were processed before or in parallel with the chunk with the erroneous record, will be successfully ingested. When asynchronous invocation is used, the entire request will be added to the Lambda's dead letter queue. Rejected records and AWS SDK for Rust errors are logged. This will help users narrow down the problem among the line protocol points in their batch.
The connector uses the InfluxDB v3 parser, since InfluxDB v3 is written in Rust and its parser is available as a crate. This parser has a few inconsistencies in behavior compared to the InfluxDB v2 parser.
93 tests have been created for the connector, 40 integration tests and 53 unit tests. The following is a list of all tests with a short summary for each:
test_mtmm_basic
integration test: Tests ingesting a single line protocol point.test_mtmm_create_database
integration test: Tests ingesting a single line protocol point and creating a database.test_mtmm_unusual_query_parameters
integration test: Tests ingesting a single line protocol point with a query parameters keys with unusual spelling.test_mtmm_no_query_parameters
integration test: Tests ingesting a single line protocol point without query parameters.test_mtmm_multiple_timestamps
integration test: Tests ingesting a single line protocol point with two timestamps. This test expects failure.test_mtmm_many_tags_many_fields
integration test: Test ingesting a single line protocol point with 50 tags and 50 fields.test_mtmm_float
integration test: Tests ingesting a single line protocol point with a float value for the field.test_mtmm_string
integration test: Tests ingesting a single line protocol point with a string value for the field.test_mtmm_bool
integration test: Tests ingesting a single line protocol point with a bool value for the field.test_mtmm_max_tag_length
integration test: Tests ingesting a single line protocol point with a tag where the tag's key has length 60, the maximum allowed dimension name length, and its value is 1988 characters long. The length of the tag key and tag value together amount to the maximum size for a dimension pair, 2 kilobytes.test_mtmm_beyond_max_tag_length
integration test: Tests ingesting a single line protocol point with a tag where the tag's key has length 60, the maximum allowed dimension name length, and its value is 1989 characters long. The length of the tag key and tag value together exceed the maximum size for a dimension pair, 2 kilobytes. This test expects failure.test_mtmm_max_field_length
integration test: Tests ingesting a single line protocol point with a field where the length of the field key is the maximum measure name length, 256 chars, and the length of the field value is the maximum measure value size, 2048.test_mtmm_beyond_max_field_length
integration test: Tests ingesting a single line protocol point with a field where the length of the field key is the maximum measure name length, 256, and the length of the field value is beyond the maximum measure value size, 2048. This test expects failure.test_mtmm_max_unique_field_keys
integration test: Tests ingesting a batch of line protocol points where the number of unique field keys in the batch equals the maximum number of allowed unique measures for a single table, 1024.test_mtmm_beyond_max_unique_field_keys
integration test: Tests ingesting a batch of line protocol points where the number of unique field keys in the batch exceeds the maximum number of allowed unique measures for a single table, 1024. This test expects failure.test_mtmm_max_unique_tag_keys
integration test: Tests ingesting a batch of line protocol points where the number of unique tag keys in the batch equals the maximum number of allowed unique dimensions for a single table, 128.test_mtmm_beyond_max_unique_tag_keys
integration test: Tests ingesting a batch of line protocol points where the number of unique tag keys in the batch exceeds the maximum number of unique dimensions for a single table, 128. This test expects failure.test_mtmm_max_table_name_length
integration test: Tests ingesting a single line protocol point with a measurement with length equal to the maximum number of bytes a table name can have.test_mtmm_beyond_max_table_name_length
integration test: Tests ingesting a single line protocol point with a measurement with length exceeding the maximum number of bytes a table name can have. This test expects failure.test_mtmm_nanosecond_precision
integration test: Tests ingesting a single line protocol point with nanosecond precision.test_mtmm_microsecond_precision
integration test: Tests ingesting a single line protocol point with microsecond precision.test_mtmm_second_precision
integration test: Tests ingesting a single line protocol point with second precision.test_mtmm_no_precision
integration test: Tests ingesting a single line protocol point without precision specified.test_mtmm_empty_point
: Tests ingesting an empty string.test_mtmm_small_timestamp
integration test: Tests ingesting a single line protocol point with a single-digit millisecond timestamp.test_mtmm_5_measurements
integration test: Tests ingesting a batch of line protocol points with five measurements.test_mtmm_100_measurements
integration test: Tests ingesting a batch of line protocol points with one-hundred measurements.test_mtmm_5000_batch
integration test: Tests ingesting a batch of 5000 line protocol points with a single measurement.test_mtmm_no_credentials
integration test: Tests ingesting without AWS credentials. This test expects a panic.test_mtmm_incorrect_credentials
integration test: Tests ingesting with incorrect AWS credentials. This tests expects a panic.test_mtmm_custom_dimension_partition_key_optional_enforcement
integration test: Tests ingesting a single line protocol point and specifying a valid configuration for a custom dimension partition key with optional enforcement.test_mtmm_custom_dimension_partition_key_required_enforcement_accepted
integration test: Tests ingesting a single line protocol point and specifying a valid configuration for a custom dimension partition key with required enforcement and a successful ingestion.test_mtmm_custom_dimension_partition_key_required_enforcement_rejected
integration test: Tests ingesting a single line protocol point and specifying a valid configuration for a custom dimension partition key with required enforcement and an unsuccessful ingestion. This test expects failure.test_mtmm_custom_dimension_partition_key_no_dimension
integration test: Tests ingesting a single line protocol point and specifying a configuration for a custom dimension partition key without a dimension specified. This test expects failure.test_mtmm_custom_dimension_partition_key_no_enforcement
integration test: Tests ingesting a single line protocol point and specifying a configuration for a custom dimension partition key without an enforcement configuration specified. This test expects failure.test_mtmm_custom_measure_partition_key
integration test: Tests ingesting a single line protocol point and specifying a valid configuration for a custom measure partition key.test_mtmm_custom_measure_partition_key_with_dimension
integration test: Tests ingesting a single line protocol point and specifying a valid configuration for a custom measure partition key with a dimension specified. The dimension should be ignored.test_mtmm_custom_measure_partition_key_with_enforcement
integration test: Tests ingesting a single line protocol point and specifying a valid configuration for a custom measure partition key with an enforcement configuration specified. The enforcement configuration should be ignored.test_stmm_basic
integration test: Tests ingesting a single line protocol point using the single-table multi measure schema.test_stmm_varying_metrics
integration test: Tests ingesting a batch with 100 different measurements using single-table multi measure schema.test_get_precision_query_string_parameters_array
unit test: Tests getting the precision from an incoming request where queryStringParameters is an array, with the precision contained within the array.test_get_precision_query_string_parameters_object
unit test: Tests getting the precision from an incoming request where queryStringParameters is a JSON object, with the precision contained within the object with the key "precision".test_get_precision_query_string_parameters_object_nanoseconds
unit test: Tests getting the precision from an incoming request where queryStringParameters is a JSON object and the precision is nanoseconds.test_get_precision_query_string_parameters_object_microseconds
unit test: Tests getting the precision from an incoming request where queryStringParameters is a JSON object and the precision is microseconds.test_get_precision_query_string_parameters_object_seconds
unit test: Tests getting the precision from an incoming request where queryStringParameters is a JSON object and the precision is seconds.test_get_precision_query_parameters_array
unit test: Tests getting the precision from an incoming request where queryStringParameters is instead named queryParameters and is a JSON object and the precision is provided as an array.test_get_precision_query_parameters_object
unit test: Tests getting the precision from an incoming request where queryStringParameters is instead named queryParameters and is a JSON object and the precision is provided as a JSON object.test_get_precision_incorrect_query_parameters_key
unit test: Tests getting the precision from an incoming request where the queryStringParameters key is neither queryStringParameters nor queryParameters. This test expects failure.test_get_precision_incorrect_precision_key
unit test: Tests getting the precision from an incoming request where the precision key has been named incorrectly.test_parse_field_integer
unit test: Tests parsing a line protocol point with an integer field value.test_parse_field_float
unit test: Tests parsing a line protocol point with a float field value.test_parse_field_string_double_quote
unit test: Tests parsing a line protocol point with a string field value using double quotes.test_parse_field_string_single_quote
unit test: Tests parsing a line protocol point with a string field value using single quotes.test_parse_field_boolean
unit test: Tests parsing a line protocol point with a boolean field value.test_parse_field_boolean_invalid
: Tests parsing a line protocol point with an invalid boolean field value. This test expects failure.test_parse_measurement_unescaped_equals
unit test: Tests parsing a line protocol point where the measurement name includes an unescaped equals sign.test_parse_measurement_underscore_begin
unit test: Tests parsing a line protocol point where the measurement name begins with an underscore.test_parse_no_fields
unit test: Tests parsing a line protocol point without any fields.test_parse_multiple_fields
unit test: Tests parsing a line protocol point with multiple fields.test_parse_multiple_measurements
unit test: Test parsing a line protocol point with multiple measurement names. This test expects failure. This test is currently skipped due to an inconsistency between the InfluxDB v2 and v3 line protocol parsers. An issue relating to this inconsistency has been opened here.test_parse_no_timestamp
unit test: Tests parsing a line protocol point without a timestamp. This test expects failure.test_parse_non_unix_timestamp
unit test: Tests parsing a line protocol point with a non-unix timestamp. This test expects failure.test_parse_timestamp_with_quotes
unit test: Tests parsing a line protocol point where the timestamp is in double quotations. This test expects failure.test_parse_no_whitespace
unit test: Tests parsing a line protocol point with no whitespace between components. This test expects failure.test_parse_multiple_timestamps
unit test: Tests parsing a line protocol point with multiple timestamps. This test expects failure.test_parse_batch
unit test: Tests parsing multiple line protocol points with integer field values.test_parse_emojis
unit test: Tests parsing a line protocol point with emojis.test_parse_escaped_comma
unit test: Tests parsing a line protocol point with escaped commas included in different parts of the point.test_parse_escaped_equals
unit test: Tests parsing a line protocol point with escaped equals signs in different parts of the point.test_parse_unescaped_equals_measurement
unit test: Tests parsing a line protocol point with an unescaped equals sign in the measurement name.test_parse_escaped_equals_measurement
unit test: Tests parsing a single line protocol point with an unescaped equals sign in the measurement name.test_parse_escaped_space
unit test: Tests parsing a line protocol point with escaped spaces in different parts of the point.test_parse_measurement_escaped_space
unit test: Tests parsing a line protocol point with an escaped space in the measurement name.test_parse_escaped_double_quote_field_value
unit test: Tests parsing a line protocol point with escaped quotes in the field value.test_parse_non_escaped_double_quote_field_value
unit test: Tests parsing a line protocol point with unescaped quotes in the field value. This test expects failure.test_parse_escaped_backslash_field_value
unit test: Tests parsing a line protocol point with escaped backslashes in the field value.test_parse_escaped_backslash_not_field_value
unit test: Tests parsing a line protocol point with escaped backslashes in different parts of the point, excluding the field value.test_parse_single_point_single_comment
unit test: Tests parsing a line protocol point with one comment included.test_parse_single_point_multiple_comments
unit test: Tests parsing a line protocol point with two comments included.test_parse_multiple_points_single_comment
unit test: Tests parsing two line protocol points with one comment included.test_parse_multiple_points_multiple_comments
unit test: Tests parsing two line protocol points with two comments included.test_parse_nanoseconds_timestamp
unit test: Tests parsing a line protocol point with a nanosecond timestamp.test_parse_microseconds_timestamp
unit test: Tests parsing a line protocol point with a microsecond timestamp.test_parse_milliseconds_timestamp
unit test: Tests parsing a line protocol point with a millisecond timestamp.test_parse_seconds_timestamp
unit test: Tests parsing a line protocol point with a second timestamp.test_parse_empty
unit test: Tests parsing empty line protocol.test_mtmm_single_record
unit test: Tests building a single record with multi-table multi measure schema.test_mtmm_single_destination
unit test: Tests building records where all records go to the same table with multi-table multi measure schema.test_mtmm_multi_record
unit test: Tests building records where records to go different tables with multi-table multi measure schema.test_mtmm_empty_dimensions
unit test: Tests building records with empty dimensions for multi-table multi measure schema.test_mtmm_varying_timestamp_records
unit test: Tests building records where records have varying timestamps for multi-table multi measure schema.test_stmm_single_record
unit test: Tests building a single record with one measure for single-table multi measure schema.test_stmm_multi_record
unit test: Tests building records with different measurement names destined for the same table with single-table multi measure schema.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.