What does Atlan crawl from Amazon MSK?
Once you've crawled Amazon MSK, you can use connector-specific filters for quick asset discovery. The following filters are currently supported for these assets:
- Topics - Message count, size (MB), partition count, and cleanup policy filters
- Consumer groups - Member count and topic name filters
Atlan crawls and maps the following assets and properties from Amazon MSK.
Atlan currently supports only asset-level lineage between topics and consumer groups. Upstream, downstream, and column-level lineage are not supported.
Topics
Atlan maps topics from Amazon MSK to its KafkaTopic asset type.
| Source property | Atlan property |
|---|---|
| Topic | name |
| PartitionCount | kafkaTopicPartitionsCount |
| ReplicationFactor | kafkaTopicReplicationFactor |
| segment.bytes | kafkaTopicSegmentBytes |
| compression.type | kafkaTopicCompressionType |
| cleanup.policy | kafkaTopicCleanupPolicy |
| isInternal | kafkaTopicIsInternal |
| sizeInBytes | kafkaTopicSizeInBytes |
| recordCount | kafkaTopicRecordCount |
| retention.ms | kafkaTopicRetentionTimeInMs |
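As an illustration of how the table above reads, the following minimal Python sketch renames a few source properties to their Atlan property names. It is not Atlan's implementation: the `TOPIC_PROPERTY_MAP` dictionary, the `to_atlan_topic` helper, and the sample values are hypothetical, and only a subset of the rows is included.

```python
# Hypothetical lookup built from a subset of the mapping table above.
TOPIC_PROPERTY_MAP = {
    "Topic": "name",
    "PartitionCount": "kafkaTopicPartitionsCount",
    "ReplicationFactor": "kafkaTopicReplicationFactor",
    "compression.type": "kafkaTopicCompressionType",
    "retention.ms": "kafkaTopicRetentionTimeInMs",
}


def to_atlan_topic(source: dict) -> dict:
    """Rename known source keys to Atlan property names; drop unknown keys."""
    return {
        TOPIC_PROPERTY_MAP[key]: value
        for key, value in source.items()
        if key in TOPIC_PROPERTY_MAP
    }


# Invented sample topic, for illustration only.
example_topic = {
    "Topic": "orders",
    "PartitionCount": 6,
    "ReplicationFactor": 3,
    "compression.type": "producer",
    "retention.ms": 604800000,
    "unmapped.setting": "ignored",
}

atlan_topic = to_atlan_topic(example_topic)
print(atlan_topic)
```

Keys that are not in the lookup, such as `unmapped.setting` here, are dropped rather than passed through, which keeps the result limited to properties the table documents.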
Consumer groups
Atlan maps consumer groups from Amazon MSK to its KafkaConsumerGroup asset type.
Consumer groups typically appear only in streaming scenarios. If a topic is not being actively consumed, Amazon MSK purges the consumer group, so a consumer group that is inactive when the Atlan workflow runs is not cataloged as an asset.
| Source property | Atlan property |
|---|---|
| GROUP | name |
| memberCount | kafkaConsumerGroupMemberCount |
| topic_names | kafkaTopicNames |
| TOPIC | kafkaConsumerGroupTopicConsumptionProperties.topicName |
| PARTITION | kafkaConsumerGroupTopicConsumptionProperties.topicPartition |
| LAG | kafkaConsumerGroupTopicConsumptionProperties.topicLag |
| CURRENT-OFFSET | kafkaConsumerGroupTopicConsumptionProperties.topicCurrentOffset |
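The sketch below shows one way to read the consumer group table: per-partition rows (`GROUP`, `TOPIC`, `PARTITION`, `LAG`, `CURRENT-OFFSET`) are grouped into one record per consumer group, with the per-partition values nested under `kafkaConsumerGroupTopicConsumptionProperties`. It assumes that property holds a list with one entry per topic partition. The `to_atlan_consumer_groups` helper and the sample rows are hypothetical and do not represent Atlan's crawler. `memberCount` is omitted because it cannot be derived from these rows.

```python
# Invented sample rows; the column names follow the table above.
rows = [
    {"GROUP": "billing", "TOPIC": "orders", "PARTITION": 0, "CURRENT-OFFSET": 120, "LAG": 5},
    {"GROUP": "billing", "TOPIC": "orders", "PARTITION": 1, "CURRENT-OFFSET": 98, "LAG": 0},
    {"GROUP": "billing", "TOPIC": "refunds", "PARTITION": 0, "CURRENT-OFFSET": 40, "LAG": 2},
]


def to_atlan_consumer_groups(rows: list) -> list:
    """Group per-partition rows into one record per consumer group."""
    groups = {}
    for row in rows:
        # Create the group record the first time this group name is seen.
        group = groups.setdefault(
            row["GROUP"],
            {
                "name": row["GROUP"],
                "kafkaTopicNames": [],
                "kafkaConsumerGroupTopicConsumptionProperties": [],
            },
        )
        # Record each topic name once, in first-seen order.
        if row["TOPIC"] not in group["kafkaTopicNames"]:
            group["kafkaTopicNames"].append(row["TOPIC"])
        # Assumption: one nested entry per topic partition.
        group["kafkaConsumerGroupTopicConsumptionProperties"].append(
            {
                "topicName": row["TOPIC"],
                "topicPartition": row["PARTITION"],
                "topicLag": row["LAG"],
                "topicCurrentOffset": row["CURRENT-OFFSET"],
            }
        )
    return list(groups.values())


consumer_groups = to_atlan_consumer_groups(rows)
print(consumer_groups)
```

A group with no rows at crawl time produces no record in this sketch, which mirrors the behavior described above for consumer groups that are inactive when the workflow runs.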