sparrow-extensions 0.1.0
Extension types for the sparrow library
sparrow_extensions Namespace Reference

Classes

struct  uuid_extension
 UUID array implementation following Arrow canonical extension specification. More...
 

Typedefs

using bool8_array = sparrow::primitive_array<int8_t, sparrow::simple_extension<"arrow.bool8">, bool>
 Bool8 array using 8-bit storage for boolean values.
 
using json_extension = sparrow::simple_extension<"arrow.json">
 
using json_array
 JSON array with 32-bit offsets.
 
using big_json_array
 JSON array with 64-bit offsets.
 
using json_view_array
 JSON array with view-based storage.
 
using uuid_array
 

Variables

constexpr int SPARROW_EXTENSIONS_VERSION_MAJOR = 0
 
constexpr int SPARROW_EXTENSIONS_VERSION_MINOR = 1
 
constexpr int SPARROW_EXTENSIONS_VERSION_PATCH = 0
 
constexpr int SPARROW_EXTENSIONS_BINARY_CURRENT = 1
 
constexpr int SPARROW_EXTENSIONS_BINARY_REVISION = 0
 
constexpr int SPARROW_EXTENSIONS_BINARY_AGE = 1
 

Typedef Documentation

◆ big_json_array

Initial value:
sparrow::variable_size_binary_array_impl<
sparrow::arrow_traits<std::string>::value_type,
sparrow::arrow_traits<std::string>::const_reference,
std::int64_t,
json_extension>

JSON array with 64-bit offsets.

A variable-size string array for storing JSON-encoded data where the cumulative length of all strings may exceed 2^31-1 bytes. Use this for very large JSON datasets.

The JSON extension type is defined as:

  • Extension name: "arrow.json"
  • Storage type: LargeString (LargeUtf8)
  • Extension metadata: none

Related Apache Arrow specification: https://arrow.apache.org/docs/format/CanonicalExtensions.html#json

See also
json_array for smaller datasets with 32-bit offsets
json_view_array for view-based storage

Definition at line 66 of file json_array.hpp.
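The 2^31-1 byte threshold described above can be checked before picking an offset width. The following is a minimal sketch in standard C++ that does not use sparrow itself; `fits_in_32bit_offsets` is a hypothetical helper name, and it takes the byte length of each JSON value rather than the values themselves.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not part of sparrow-extensions): returns true when the
// cumulative byte length of all values can be addressed with signed 32-bit
// offsets, i.e. when it does not exceed 2^31 - 1.
bool fits_in_32bit_offsets(const std::vector<std::uint64_t>& byte_lengths)
{
    constexpr std::uint64_t max_total = std::numeric_limits<std::int32_t>::max();
    std::uint64_t total = 0;
    for (const std::uint64_t len : byte_lengths)
    {
        // Compare against the remaining budget so the sum can never overflow.
        if (len > max_total - total)
        {
            return false;  // 64-bit offsets would be needed (big_json_array)
        }
        total += len;
    }
    return true;  // 32-bit offsets suffice (json_array)
}
```

When the function returns false, the data would need the 64-bit variant; otherwise the 32-bit variant is enough.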

◆ bool8_array

using sparrow_extensions::bool8_array = sparrow::primitive_array<int8_t, sparrow::simple_extension<"arrow.bool8">, bool>

Bool8 array using 8-bit storage for boolean values.

Bool8 represents a boolean value using 1 byte (8 bits) to store each value instead of only 1 bit as in the original Arrow Boolean type. Although less compact than the original representation, Bool8 may have better zero-copy compatibility with various systems that also store booleans using 1 byte.

The Bool8 extension type is defined as:

  • Extension name: "arrow.bool8"
  • Storage type: Int8
  • false is denoted by the value 0
  • true can be specified using any non-zero value (preferably 1)
  • Extension metadata: empty string

Definition at line 38 of file bool8_array.hpp.
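The value convention listed above (0 is false, any non-zero value is true, 1 is preferred for true) can be sketched with two small conversion functions. These helpers are hypothetical illustrations of the convention and are not part of the library's API.

```cpp
#include <cstdint>

// Hypothetical helper: interpret one Int8 storage value as a boolean.
// Any non-zero value, including negative ones, reads as true.
constexpr bool bool8_to_bool(std::int8_t storage) noexcept
{
    return storage != 0;
}

// Hypothetical helper: encode a boolean as its Int8 storage value,
// writing the preferred value 1 for true.
constexpr std::int8_t bool_to_bool8(bool value) noexcept
{
    return value ? std::int8_t{1} : std::int8_t{0};
}

// Compile-time checks of the convention.
static_assert(!bool8_to_bool(0));
static_assert(bool8_to_bool(1));
static_assert(bool8_to_bool(-1));
static_assert(bool_to_bool8(true) == 1);
```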

◆ json_array

Initial value:
sparrow::variable_size_binary_array_impl<
sparrow::arrow_traits<std::string>::value_type,
sparrow::arrow_traits<std::string>::const_reference,
std::int32_t,

JSON array with 32-bit offsets.

A variable-size string array for storing JSON-encoded data where the cumulative length of all strings does not exceed 2^31-1 bytes. This is the standard choice for most JSON datasets.

The JSON extension type is defined as:

  • Extension name: "arrow.json"
  • Storage type: String (Utf8)
  • Extension metadata: none

Related Apache Arrow specification: https://arrow.apache.org/docs/format/CanonicalExtensions.html#json

See also
big_json_array for larger datasets requiring 64-bit offsets
json_view_array for view-based storage

Definition at line 43 of file json_array.hpp.
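To illustrate the offset-based String (Utf8) storage named above, the sketch below builds the two buffers of the Arrow variable-size layout by hand: an offsets buffer with one more entry than there are values, and a data buffer holding all values concatenated. The `string_buffers` struct and both functions are hypothetical stand-ins written for this example; they do not reflect sparrow's internal representation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative stand-in for the buffers of an Arrow String (Utf8) array.
struct string_buffers
{
    std::vector<std::int32_t> offsets;  // n + 1 entries, offsets[0] == 0
    std::string data;                   // all values concatenated
};

// Builds the buffers. The caller is assumed to have checked that the total
// byte length does not exceed 2^31 - 1.
string_buffers build_string_buffers(const std::vector<std::string>& values)
{
    string_buffers out;
    out.offsets.reserve(values.size() + 1);
    out.offsets.push_back(0);
    for (const auto& v : values)
    {
        out.data += v;
        out.offsets.push_back(static_cast<std::int32_t>(out.data.size()));
    }
    return out;
}

// Value i spans the bytes [offsets[i], offsets[i + 1]) of the data buffer.
std::string value_at(const string_buffers& b, std::size_t i)
{
    const auto begin = static_cast<std::size_t>(b.offsets[i]);
    const auto end = static_cast<std::size_t>(b.offsets[i + 1]);
    return b.data.substr(begin, end - begin);
}
```

Because every offset is a signed 32-bit integer, the last offset (the total data size) is what bounds the array at 2^31-1 bytes.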

◆ json_extension

using sparrow_extensions::json_extension = sparrow::simple_extension<"arrow.json">

Definition at line 24 of file json_array.hpp.

◆ json_view_array

Initial value:
sparrow::variable_size_binary_view_array_impl<
sparrow::arrow_traits<std::string>::value_type,
sparrow::arrow_traits<std::string>::const_reference,

JSON array with view-based storage.

A variable-size string view array for storing JSON-encoded data using the Binary View layout, which is optimized for performance by storing short values inline and using references to external buffers for longer values.

The JSON extension type is defined as:

  • Extension name: "arrow.json"
  • Storage type: StringView (Utf8View)
  • Extension metadata: none

Related Apache Arrow specification: https://arrow.apache.org/docs/format/CanonicalExtensions.html#json

See also
json_array for offset-based storage with 32-bit offsets
big_json_array for offset-based storage with 64-bit offsets

Definition at line 90 of file json_array.hpp.
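The inline versus referenced storage described above can be sketched as follows. In the Arrow Binary View layout each value is described by a 16-byte view: a 4-byte length, followed either by up to 12 bytes of inline data or by a 4-byte prefix, a 4-byte buffer index, and a 4-byte offset into that buffer. The `view_entry` type and the functions below are hypothetical and written only for illustration; they assume a little-endian host and are not sparrow's implementation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Illustrative 16-byte view, zero-initialised so inline padding is zero.
struct view_entry
{
    std::array<std::uint8_t, 16> bytes{};
};

constexpr std::size_t max_inline_size = 12;

// buffer_index and offset are only stored for values longer than 12 bytes.
view_entry make_view(std::string_view value, std::int32_t buffer_index, std::int32_t offset)
{
    view_entry v;
    const auto length = static_cast<std::int32_t>(value.size());
    std::memcpy(v.bytes.data(), &length, 4);
    if (value.size() <= max_inline_size)
    {
        if (!value.empty())
        {
            // Short value: the bytes live directly inside the view.
            std::memcpy(v.bytes.data() + 4, value.data(), value.size());
        }
    }
    else
    {
        // Long value: keep a 4-byte prefix and point into an external buffer.
        std::memcpy(v.bytes.data() + 4, value.data(), 4);
        std::memcpy(v.bytes.data() + 8, &buffer_index, 4);
        std::memcpy(v.bytes.data() + 12, &offset, 4);
    }
    return v;
}

// Reads the length field back and reports whether the value is stored inline.
bool is_inline(const view_entry& v)
{
    std::int32_t length = 0;
    std::memcpy(&length, v.bytes.data(), 4);
    return length <= static_cast<std::int32_t>(max_inline_size);
}
```

Short JSON values such as `{"a":1}` never touch an external buffer, which is where the layout's performance benefit for small values comes from.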

◆ uuid_array

Initial value:
sparrow::fixed_width_binary_array_impl<
sparrow::fixed_width_binary_traits::value_type,
sparrow::fixed_width_binary_traits::const_reference,

UUID array implementation following Arrow canonical extension specification.

Definition at line 82 of file uuid_array.hpp.
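The Arrow canonical UUID extension ("arrow.uuid") stores each value as 16 bytes of fixed-size binary in big-endian order, so the byte order matches the textual form. As a sketch of how one such 16-byte value might be produced, the code below parses the usual 8-4-4-4-12 hexadecimal text into a byte array. `uuid_bytes`, `hex_value`, and `parse_uuid` are hypothetical names for this example and are not part of sparrow-extensions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

using uuid_bytes = std::array<std::uint8_t, 16>;

// Returns the numeric value of a hexadecimal digit, or -1 if c is not one.
constexpr int hex_value(char c) noexcept
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into 16 bytes.
// Returns std::nullopt when the text is malformed.
std::optional<uuid_bytes> parse_uuid(std::string_view text)
{
    if (text.size() != 36) return std::nullopt;
    uuid_bytes out{};
    std::size_t byte_index = 0;
    std::size_t i = 0;
    while (i < text.size())
    {
        if (i == 8 || i == 13 || i == 18 || i == 23)
        {
            // Group separators must sit at these fixed positions.
            if (text[i] != '-') return std::nullopt;
            ++i;
            continue;
        }
        const int hi = hex_value(text[i]);
        const int lo = hex_value(text[i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        out[byte_index++] = static_cast<std::uint8_t>(hi * 16 + lo);
        i += 2;
    }
    return out;
}
```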

Variable Documentation

◆ SPARROW_EXTENSIONS_BINARY_AGE

int sparrow_extensions::SPARROW_EXTENSIONS_BINARY_AGE = 1
constexpr

Definition at line 25 of file sparrow_extensions_version.hpp.

◆ SPARROW_EXTENSIONS_BINARY_CURRENT

int sparrow_extensions::SPARROW_EXTENSIONS_BINARY_CURRENT = 1
constexpr

Definition at line 23 of file sparrow_extensions_version.hpp.

◆ SPARROW_EXTENSIONS_BINARY_REVISION

int sparrow_extensions::SPARROW_EXTENSIONS_BINARY_REVISION = 0
constexpr

Definition at line 24 of file sparrow_extensions_version.hpp.

◆ SPARROW_EXTENSIONS_VERSION_MAJOR

int sparrow_extensions::SPARROW_EXTENSIONS_VERSION_MAJOR = 0
constexpr

Definition at line 19 of file sparrow_extensions_version.hpp.

◆ SPARROW_EXTENSIONS_VERSION_MINOR

int sparrow_extensions::SPARROW_EXTENSIONS_VERSION_MINOR = 1
constexpr

Definition at line 20 of file sparrow_extensions_version.hpp.

◆ SPARROW_EXTENSIONS_VERSION_PATCH

int sparrow_extensions::SPARROW_EXTENSIONS_VERSION_PATCH = 0
constexpr

Definition at line 21 of file sparrow_extensions_version.hpp.