Output
Abstract class in `application_sdk.outputs`
Output classes provide a unified interface for writing data to various destinations in the Application SDK. All output classes inherit from the base `Output` class and support writing data from pandas or daft DataFrames. Through this shared foundation, every output type gets consistent behavior: automatic chunking, buffer management, statistics tracking, and object store uploads.
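The write path described above (buffer records, flush fixed-size chunks to generated paths, track record and chunk counts) can be sketched in plain Python. The class below is illustrative only; its names mirror the base `Output` attributes (`total_record_count`, `chunk_count`, `path_gen`) but it is not the SDK's actual implementation:

```python
import json
import tempfile
from pathlib import Path

class ChunkedOutput:
    """Minimal sketch of the chunk-and-flush pattern described above.
    Not the SDK's Output class; a toy model of the same idea."""

    def __init__(self, output_path: str, buffer_size: int = 2):
        self.output_path = Path(output_path)
        self.buffer_size = buffer_size      # records held before a flush
        self.total_record_count = 0
        self.chunk_count = 0
        self._buffer = []

    def path_gen(self, chunk_count: int) -> str:
        # Mirrors the idea of path_gen: deterministic per-chunk file names.
        return str(self.output_path / f"chunk-{chunk_count}.jsonl")

    def write_record(self, record: dict) -> None:
        self._buffer.append(record)
        self.total_record_count += 1
        if len(self._buffer) >= self.buffer_size:
            self._flush()

    def _flush(self) -> None:
        if not self._buffer:
            return
        with open(self.path_gen(self.chunk_count), "w") as f:
            for record in self._buffer:
                f.write(json.dumps(record) + "\n")
        self.chunk_count += 1
        self._buffer = []

    def close(self) -> None:
        self._flush()   # flush any partial final chunk

with tempfile.TemporaryDirectory() as tmp:
    out = ChunkedOutput(tmp, buffer_size=2)
    for i in range(5):
        out.write_record({"id": i})
    out.close()
    print(out.total_record_count, out.chunk_count)  # 5 records, 3 chunks
```

Five records with a buffer of two yield three chunk files: two full chunks plus the partial chunk flushed by `close()`.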
Properties

- `output_path` (str)
- `output_prefix` (str)
- `total_record_count` (int)
- `chunk_count` (int)
- `chunk_part` (int)
- `buffer_size` (int)
- `max_file_size_bytes` (int)
- `partitions` (List[int])

Methods

write_daft_dataframe
`async write_daft_dataframe(self, dataframe: daft.DataFrame) -> None`
Parameters
- `dataframe` (daft.DataFrame)

write_dataframe
`async write_dataframe(self, dataframe: pd.DataFrame) -> None`
Parameters
- `dataframe` (pd.DataFrame)

write_batched_dataframe
`async write_batched_dataframe(self, batch_df: pd.DataFrame) -> None`
Parameters
- `batch_df` (pd.DataFrame)

write_batched_daft_dataframe
`async write_batched_daft_dataframe(self, batch_daft_df: daft.DataFrame) -> None`
Parameters
- `batch_daft_df` (daft.DataFrame)

get_statistics
`async get_statistics(self, typename: Optional[str] = None) -> ActivityStatistics`
Parameters
- `typename` (Optional[str])
Returns
- `ActivityStatistics` - Object containing output statistics

path_gen
`path_gen(self, chunk_count: Optional[int] = None, chunk_part: int = 0, start_marker: Optional[str] = None, end_marker: Optional[str] = None) -> str`
Parameters
- `chunk_count` (Optional[int])
- `chunk_part` (int)
- `start_marker` (Optional[str])
- `end_marker` (Optional[str])
Returns
- `str` - Generated file path

process_null_fields
`process_null_fields(self, obj: Any, preserve_fields: Optional[List[str]] = None, null_to_empty_dict_fields: Optional[List[str]] = None) -> Any`
Parameters
- `obj` (Any)
- `preserve_fields` (Optional[List[str]])
- `null_to_empty_dict_fields` (Optional[List[str]])
Returns
- `Any` - Cleaned object with null values removed

WriteMode enum
The WriteMode enum defines the available write modes for output operations:
class WriteMode(Enum):
APPEND = "append" # Append data to existing files
OVERWRITE = "overwrite" # Overwrite existing files
OVERWRITE_PARTITIONS = "overwrite-partitions" # Overwrite specific partitions
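Because `WriteMode` is a plain Python `Enum`, the string values shown above round-trip through the enum constructor, which is convenient when a mode arrives from configuration. A self-contained example (the enum is restated here so the snippet runs on its own):

```python
from enum import Enum

class WriteMode(Enum):
    APPEND = "append"
    OVERWRITE = "overwrite"
    OVERWRITE_PARTITIONS = "overwrite-partitions"

# A string value from config resolves to the matching enum member.
mode = WriteMode("overwrite-partitions")
print(mode is WriteMode.OVERWRITE_PARTITIONS)  # True
print(mode.value)                              # overwrite-partitions
```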
Output implementations
The Application SDK provides three concrete implementations of the base Output class, each optimized for different data formats and storage requirements. All implementations inherit the common functionality from the base class, including automatic chunking, buffer management, statistics tracking, and object store uploads.
ParquetOutput
Columnar format: writes data to Parquet files with support for chunking, consolidation, Hive partitioning, and automatic object store uploads.
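Hive partitioning, mentioned above, encodes partition columns as `key=value` directory segments in the output path. The helper below sketches that layout with the standard library; the function name and prefix are illustrative, not part of the SDK:

```python
from pathlib import PurePosixPath

def hive_partition_path(prefix: str, partition_values: dict, filename: str) -> str:
    """Build a Hive-style partitioned path, e.g.
    prefix/year=2024/month=05/chunk-0.parquet.
    Illustrative helper, not an SDK API."""
    parts = [f"{key}={value}" for key, value in partition_values.items()]
    return str(PurePosixPath(prefix, *parts, filename))

path = hive_partition_path("raw/tables/orders",
                           {"year": 2024, "month": "05"},
                           "chunk-0.parquet")
print(path)  # raw/tables/orders/year=2024/month=05/chunk-0.parquet
```

Engines that understand this convention (Hive, Spark, daft, and most Parquet readers) can prune entire directories when a query filters on a partition column.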
JsonOutput
JSONL format: writes data to JSON Lines (JSONL) files with support for chunking, buffering, null-field processing, and automatic object store uploads.
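The null-field processing mentioned above corresponds to the base class's `process_null_fields` method. The sketch below shows one plausible behavior inferred from that signature (drop `None`-valued keys, keep the ones listed in `preserve_fields`, turn the ones in `null_to_empty_dict_fields` into empty dicts); the actual SDK behavior may differ in detail:

```python
from typing import Any, List, Optional

def process_null_fields(obj: Any,
                        preserve_fields: Optional[List[str]] = None,
                        null_to_empty_dict_fields: Optional[List[str]] = None) -> Any:
    """Sketch of null-field cleanup, inferred from the signature above;
    not copied from the SDK implementation."""
    preserve = set(preserve_fields or [])
    to_empty = set(null_to_empty_dict_fields or [])
    if isinstance(obj, dict):
        cleaned = {}
        for key, value in obj.items():
            if value is None:
                if key in to_empty:
                    cleaned[key] = {}
                elif key in preserve:
                    cleaned[key] = None
                # otherwise drop the null-valued key entirely
            else:
                cleaned[key] = process_null_fields(value, preserve_fields,
                                                   null_to_empty_dict_fields)
        return cleaned
    if isinstance(obj, list):
        return [process_null_fields(item, preserve_fields,
                                    null_to_empty_dict_fields) for item in obj]
    return obj

record = {"name": "orders", "owner": None, "tags": None, "attrs": {"note": None}}
print(process_null_fields(record,
                          preserve_fields=["owner"],
                          null_to_empty_dict_fields=["tags"]))
# {'name': 'orders', 'owner': None, 'tags': {}, 'attrs': {}}
```

Stripping nulls before serialization keeps JSONL records compact and avoids emitting keys that downstream consumers would have to treat as absent anyway.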
IcebergOutput
Table format: writes data to Apache Iceberg tables using daft, with support for table creation, schema inference, and multiple write modes.
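The three `WriteMode` values differ in how they treat data already in the table. The toy model below simulates the semantics on a plain mapping of partition key to rows; it is a conceptual illustration, not how daft or Iceberg actually apply writes:

```python
def apply_write(table: dict, new_data: dict, mode: str) -> dict:
    """Toy model of WriteMode semantics over a {partition: rows} mapping.
    Conceptual only; not the daft/Iceberg write path."""
    if mode == "overwrite":
        return dict(new_data)                  # replace the whole table
    if mode == "overwrite-partitions":
        merged = dict(table)
        merged.update(new_data)                # replace only touched partitions
        return merged
    if mode == "append":
        merged = {k: list(v) for k, v in table.items()}
        for part, rows in new_data.items():
            merged.setdefault(part, []).extend(rows)
        return merged
    raise ValueError(f"unknown write mode: {mode}")

table = {"2024-01": [1, 2], "2024-02": [3]}
incoming = {"2024-02": [9]}
print(apply_write(table, incoming, "append"))
# {'2024-01': [1, 2], '2024-02': [3, 9]}
print(apply_write(table, incoming, "overwrite-partitions"))
# {'2024-01': [1, 2], '2024-02': [9]}
```

`overwrite-partitions` is the interesting case: partitions untouched by the incoming data survive, which makes it suitable for idempotent re-runs of a single partition's extraction.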
See also
- Inputs: Read data from various sources including SQL queries, Parquet files, JSON files, and Iceberg tables
- Application SDK README: Overview of the Application SDK and its components
- App structure: Standardized folder structure for Atlan applications
- StateStore: Persistent state management for workflows and credentials