sparrow-ipc 0.2.0
sparrow_ipc::serializer Class Reference

A class for serializing Apache Arrow record batches to an output stream.

#include <serializer.hpp>

Public Member Functions

template<writable_stream TStream>
 serializer (TStream &stream, std::optional< CompressionType > compression=std::nullopt)
 Constructs a serializer object with a reference to a stream.
 
 ~serializer ()
 Destructor for the serializer.
 
void write (const sparrow::record_batch &rb)
 Writes a record batch to the serializer.
 
template<std::ranges::input_range R>
requires std::same_as<std::ranges::range_value_t<R>, sparrow::record_batch>
void write (const R &record_batches)
 Writes a collection of record batches to the stream.
 
serializer & operator<< (const sparrow::record_batch &rb)
 
template<std::ranges::input_range R>
requires std::same_as<std::ranges::range_value_t<R>, sparrow::record_batch>
serializer & operator<< (const R &record_batches)
 
serializer & operator<< (serializer &(*manip)(serializer &))
 
void end ()
 Finalizes the serialization process by writing an end-of-stream marker.
 

Detailed Description

A class for serializing Apache Arrow record batches to an output stream.

The serializer class provides functionality to serialize single or multiple record batches into a binary format suitable for storage or transmission. It ensures schema consistency across multiple record batches and optimizes memory allocation by pre-calculating required buffer sizes.

The serializer is constructed on an output stream and supports two main usage patterns:

  1. Writing a collection of record batches in a single call, via write() or operator<< on a range
  2. Writing record batches one at a time through repeated calls to write() or operator<<

The class validates that all record batches have consistent schemas and throws std::invalid_argument if inconsistencies are detected or if an empty collection is provided.
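
The validation rule described above can be illustrated with a minimal, self-contained sketch. It does not use the library: `mock_batch` and `check_consistent_schema` are hypothetical names introduced here, and comparing column names is only an assumed stand-in for whatever schema comparison the serializer actually performs.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for sparrow::record_batch: only column names matter here.
struct mock_batch
{
    std::vector<std::string> column_names;
};

// Throws std::invalid_argument if the collection is empty or if any batch's
// column names differ from those of the first batch.
inline void check_consistent_schema(const std::vector<mock_batch>& batches)
{
    if (batches.empty())
    {
        throw std::invalid_argument("empty record batch collection");
    }
    const auto& reference = batches.front().column_names;
    for (const auto& batch : batches)
    {
        if (batch.column_names != reference)
        {
            throw std::invalid_argument("inconsistent record batch schemas");
        }
    }
}
```

Callers of the real class would typically wrap a multi-batch write in a `try`/`catch` for `std::invalid_argument` in the same way.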

Memory efficiency is achieved through:

  • Pre-calculation of total serialization size
  • Stream reservation to minimize memory reallocations
  • Lazy evaluation of size calculations using lambda functions

Definition at line 34 of file serializer.hpp.

Constructor & Destructor Documentation

◆ serializer()

template<writable_stream TStream>
sparrow_ipc::serializer::serializer ( TStream & stream,
std::optional< CompressionType > compression = std::nullopt )
inline

Constructs a serializer object with a reference to a stream.

Template Parameters
TStream: The type of the stream to be used for serialization.
Parameters
stream: Reference to the stream object that will be used for serialization operations. The serializer stores a pointer to this stream for later use, so the stream must outlive the serializer.
compression: Optional compression type to use for record batch bodies. Defaults to std::nullopt.

Definition at line 47 of file serializer.hpp.


◆ ~serializer()

sparrow_ipc::serializer::~serializer ( )

Destructor for the serializer.

Ensures proper cleanup by calling end() if the serializer has not been explicitly ended. This guarantees that any pending data is flushed and resources are properly released before the object is destroyed.

Member Function Documentation

◆ end()

void sparrow_ipc::serializer::end ( )

Finalizes the serialization process by writing an end-of-stream marker.

This method writes an end-of-stream marker to the output stream and flushes any buffered data. It can be called multiple times safely as it tracks whether the stream has already been ended to prevent duplicate operations.

Note
This method is idempotent: calling it multiple times has no additional effect.
Postcondition
After calling this method, m_ended will be set to true.
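
The interaction between end() and the destructor can be sketched without the library. In the block below, `mock_serializer`, its `std::string` sink, and the `"<EOS>"` text marker are all hypothetical stand-ins; only the `m_ended` flag and the "end at most once, and on destruction if needed" behaviour come from the description above.

```cpp
#include <string>

// Hypothetical stand-in for the serializer: it appends text markers to a
// std::string instead of writing Arrow IPC bytes.
class mock_serializer
{
public:
    explicit mock_serializer(std::string& sink) : m_sink(&sink) {}

    // Finalizes on destruction; end() itself skips the work if already ended.
    ~mock_serializer() { end(); }

    mock_serializer(const mock_serializer&) = delete;
    mock_serializer& operator=(const mock_serializer&) = delete;

    void write(const std::string& payload) { *m_sink += payload; }

    // Writes the end-of-stream marker at most once.
    void end()
    {
        if (m_ended)
        {
            return;
        }
        *m_sink += "<EOS>";
        m_ended = true;
    }

private:
    std::string* m_sink;   // non-owning: the sink must outlive this object
    bool m_ended = false;
};
```

In this sketch, calling end() twice and then letting the object go out of scope still leaves exactly one marker in the sink.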
Examples
/home/runner/work/sparrow-ipc/sparrow-ipc/include/sparrow_ipc/serializer.hpp.

◆ operator<<() [1/3]

template<std::ranges::input_range R>
requires std::same_as<std::ranges::range_value_t<R>, sparrow::record_batch>
serializer & sparrow_ipc::serializer::operator<< ( const R & record_batches)
inline

Definition at line 175 of file serializer.hpp.


◆ operator<<() [2/3]

serializer & sparrow_ipc::serializer::operator<< ( const sparrow::record_batch & rb)
inline

Definition at line 149 of file serializer.hpp.


◆ operator<<() [3/3]

serializer & sparrow_ipc::serializer::operator<< ( serializer &(* manip )(serializer &))
inline

Definition at line 195 of file serializer.hpp.
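
This overload carries no description. Its signature matches the stream-manipulator idiom used by `std::endl`, where a free function is applied to the stream object. The sketch below shows that idiom on a stand-in type; `sketch_writer` and the `end_stream` manipulator are hypothetical names, and the assumption that a manipulator would finalize the stream is not confirmed by this page.

```cpp
#include <string>

// Hypothetical stand-in for the serializer: it records text in a std::string
// rather than writing Arrow IPC bytes.
class sketch_writer
{
public:
    explicit sketch_writer(std::string& sink) : m_sink(&sink) {}

    // Stand-in for operator<<(const sparrow::record_batch&).
    sketch_writer& operator<<(const std::string& payload)
    {
        *m_sink += payload;
        return *this;
    }

    // Same shape as the documented overload: apply the function to *this.
    sketch_writer& operator<<(sketch_writer& (*manip)(sketch_writer&))
    {
        return manip(*this);
    }

    // Appends the end marker once.
    void end()
    {
        if (!m_ended)
        {
            *m_sink += "<EOS>";
            m_ended = true;
        }
    }

private:
    std::string* m_sink;
    bool m_ended = false;
};

// Hypothetical manipulator: finalizes the writer when streamed into it.
inline sketch_writer& end_stream(sketch_writer& w)
{
    w.end();
    return w;
}
```

With these definitions, `w << std::string("a;") << end_stream;` appends the payload and then the marker in one chained expression.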


◆ write() [1/2]

template<std::ranges::input_range R>
requires std::same_as<std::ranges::range_value_t<R>, sparrow::record_batch>
void sparrow_ipc::serializer::write ( const R & record_batches)
inline

Writes a collection of record batches to the stream.

This method efficiently adds multiple record batches to the serialization stream by first calculating the total required size and reserving memory space to minimize reallocations during the append operations.

Template Parameters
R: The type of the record batch collection (must be iterable)
Parameters
record_batches: A collection of record batches to append to the stream

The method performs the following operations:

  1. Calculates the total size needed for all record batches
  2. Reserves the required memory space in the stream
  3. Iterates through each record batch and adds it to the stream

Definition at line 85 of file serializer.hpp.


◆ write() [2/2]

void sparrow_ipc::serializer::write ( const sparrow::record_batch & rb)

Writes a record batch to the serializer.

Parameters
rb: The record batch to write to the serializer

The documentation for this class was generated from the following file: