Apache Commons Çorbası: July 2023

Tuesday, 18 July 2023

Apache CarbonData

Introduction

The explanation is as follows:

Apache CarbonData is an indexed columnar data format that is developed specifically for big data scenarios where fast analytics and real-time insights are critical.

Deep Integration with Spark

The explanation is as follows:

CarbonData has been deeply integrated with Apache Spark, providing Spark SQL’s query optimization techniques and using its Code Generation capabilities. This makes it possible to directly query CarbonData files using Spark SQL, hence giving faster and more efficient query results.

Multi-Layered Structure

The explanation is as follows:

Apache CarbonData is structured in multiple layers, which includes the table, segment, block, and page levels. This hierarchical structure allows efficient data retrieval by skipping irrelevant data during the query execution.

Table: A table is a collection of segments, and each segment represents a set of data files.

Segment: A segment contains multiple data blocks, where each block can store a significant amount of data.

Block: A block is divided into blocklets. Each blocklet holds a series of column pages, which are organized column-wise.

Page: The page level is where the actual data is stored. The data in these pages is encoded and compressed, making data retrieval efficient.
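The skipping behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Node` class, the `scan` function, and the min/max pruning rule are hypothetical illustrations and do not reflect CarbonData's actual file layout or API. Each level keeps the minimum and maximum of the values beneath it, and a range query skips any subtree whose range cannot match.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical node in a table > segment > block > blocklet > page tree."""
    level: str
    children: List["Node"] = field(default_factory=list)
    values: List[int] = field(default_factory=list)  # only pages hold values

    def bounds(self) -> Tuple[int, int]:
        # A real format would store these statistics as metadata;
        # here they are recomputed on demand to keep the sketch short.
        if self.values:
            return min(self.values), max(self.values)
        lows, highs = zip(*(child.bounds() for child in self.children))
        return min(lows), max(highs)


def scan(node: Node, lo: int, hi: int, stats: Dict) -> List[int]:
    """Return values in [lo, hi], skipping subtrees whose range cannot match."""
    node_lo, node_hi = node.bounds()
    if node_hi < lo or node_lo > hi:
        stats["skipped"].append(node.level)  # whole subtree is irrelevant
        return []
    if node.level == "page":
        stats["pages_read"] += 1
        return [v for v in node.values if lo <= v <= hi]
    matches: List[int] = []
    for child in node.children:
        matches.extend(scan(child, lo, hi, stats))
    return matches


def page(*vals: int) -> Node:
    return Node("page", values=list(vals))


table = Node("table", [
    Node("segment", [
        Node("block", [
            Node("blocklet", [page(1, 2, 3), page(4, 5, 6)]),
            Node("blocklet", [page(7, 8, 9)]),
        ]),
    ]),
    Node("segment", [
        Node("block", [
            Node("blocklet", [page(100, 101), page(102, 103)]),
        ]),
    ]),
])

stats = {"skipped": [], "pages_read": 0}
result = scan(table, 5, 8, stats)
print(result)  # [5, 6, 7, 8]
print(stats)   # {'skipped': ['page', 'segment'], 'pages_read': 2}
```

In this example the second segment is discarded with a single comparison, so none of its blocks, blocklets, or pages are visited.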

Avro Compatibility

Introduction

Details of these explanations are available here

1. Backward Compatibility

This concerns those on the newer schema. Backward compatibility means that the latest version can read data produced with an earlier version.

BACKWARD

The two most recent schemas are backward compatible. That is, the latest schema can read data produced by the one immediately before it.

BACKWARD_TRANSITIVE

All schemas are backward compatible.

2. Forward Compatibility

This concerns those on the older schema. Forward compatibility means that an older schema can read data produced by a newer schema.

FORWARD

The two most recent schemas are forward compatible. That is, the second-to-last schema can read data produced by the latest schema.

FORWARD_TRANSITIVE

All schemas are forward compatible.

3. Full Compatibility

FULL

The two most recent schemas can read each other's data.

FULL_TRANSITIVE

All schemas can read each other's data.
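The six modes above can be sketched with a deliberately simplified model. The code below is an assumption-laden illustration, not Avro's or any schema registry's real API: `make_field`, `can_read`, and `is_compatible` are hypothetical helpers, and the only resolution rule modelled is that a reader field must either exist in the writer schema with the same type or carry a default. Type promotion, aliases, and unions are ignored.

```python
NO_DEFAULT = object()  # sentinel: the field declares no default value


def make_field(type_, default=NO_DEFAULT):
    return {"type": type_, "default": default}


def can_read(reader, writer):
    """True if data written with `writer` can be read using `reader`."""
    for name, spec in reader.items():
        if name in writer:
            if writer[name]["type"] != spec["type"]:
                return False
        elif spec["default"] is NO_DEFAULT:
            return False  # missing in the data and nothing to fall back on
    return True  # extra writer fields are simply ignored by the reader


def is_compatible(mode, new, history):
    """Check `new` against earlier schemas (`history`, oldest first)."""
    base, _, transitive = mode.partition("_")
    # Non-transitive modes compare only with the most recent earlier schema.
    others = history if transitive else history[-1:]
    backward = all(can_read(new, old) for old in others)
    forward = all(can_read(old, new) for old in others)
    return {"BACKWARD": backward,
            "FORWARD": forward,
            "FULL": backward and forward}[base]


v1 = {"id": make_field("long")}
v2 = {"id": make_field("long"), "nickname": make_field("string", default="")}
v3 = {"id": make_field("long"), "nickname": make_field("string")}  # default dropped

for mode in ("BACKWARD", "BACKWARD_TRANSITIVE",
             "FORWARD", "FORWARD_TRANSITIVE",
             "FULL", "FULL_TRANSITIVE"):
    print(mode, is_compatible(mode, v3, [v1, v2]))
# BACKWARD True
# BACKWARD_TRANSITIVE False
# FORWARD True
# FORWARD_TRANSITIVE True
# FULL True
# FULL_TRANSITIVE False
```

Here `v3` can read data written with `v2` because `nickname` is always present, but it cannot read data written with `v1`, which is why only the transitive backward and full checks fail.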