The explanation is as follows
Apache Arrow Flight is a high-performance RPC framework designed specifically for transferring large amounts of columnar data over a network. Unlike ODBC/JDBC, it eliminates the need for intermediate serialization steps, significantly reducing transfer latency and increasing throughput.
Arrow Flight SQL
The explanation is as follows
Apache Arrow Flight SQL extends Arrow Flight by providing a standardized interface for SQL-based interactions with databases. This means that developers can benefit from Arrow’s high-speed data transfers while maintaining a familiar SQL interface. Unlike traditional database protocols, Flight SQL enables direct execution of SQL queries over the high-performance Arrow Flight transport layer, eliminating unnecessary serialization overhead and reducing query latency.
Apache Arrow Database Connectivity (ADBC)
The explanation is as follows
ADBC provides a standardized API for database interactions, making it easier for developers to query and work with databases using Arrow-native data (with/without Flight SQL).
Adoption of ADBC
DuckDB, dbt, and Snowflake support ADBC.
Apache Arrow vs Apache Parquet
The explanation is as follows
The Apache Arrow format project began in February 2016, focusing on columnar in-memory analytics workloads. Unlike file formats like Parquet or CSV, which specify how data is organized on disk, Arrow focuses on how data is organized in memory.
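To make "how data is organized in memory" more concrete, here is a rough sketch in plain Java of how a columnar format can lay out a column of strings: one contiguous data buffer, an offsets array, and a validity marker per slot. The class name ColumnarSketch and its fields are hypothetical and exist only for illustration. This is not the Arrow library API, and the real format packs validity into a bitmap rather than a boolean array.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy class: a simplified columnar layout for a string column.
final class ColumnarSketch {
    final int[] offsets;   // value i occupies data[offsets[i] .. offsets[i + 1])
    final byte[] data;     // all values concatenated back to back
    final boolean[] valid; // false marks a null slot (simplified validity bitmap)

    ColumnarSketch(List<String> values) {
        int n = values.size();
        offsets = new int[n + 1];
        valid = new boolean[n];
        byte[][] encoded = new byte[n][];
        int total = 0;
        for (int i = 0; i < n; i++) {
            String v = values.get(i);
            valid[i] = v != null;
            // A null takes no space in the data buffer
            encoded[i] = v == null ? new byte[0] : v.getBytes(StandardCharsets.UTF_8);
            total += encoded[i].length;
            offsets[i + 1] = total;
        }
        data = new byte[total];
        for (int i = 0; i < n; i++) {
            System.arraycopy(encoded[i], 0, data, offsets[i], encoded[i].length);
        }
    }

    // Random access needs only two offset lookups, no per-row parsing
    String get(int i) {
        if (!valid[i]) {
            return null;
        }
        return new String(data, offsets[i], offsets[i + 1] - offsets[i], StandardCharsets.UTF_8);
    }

    int size() {
        return valid.length;
    }

    public static void main(String[] args) {
        ColumnarSketch col = new ColumnarSketch(Arrays.asList("Apache", null, "Arrow"));
        for (int i = 0; i < col.size(); i++) {
            System.out.println(col.get(i));
        }
    }
}
```

Because every value of the column sits in one contiguous buffer, a scan over the column touches memory sequentially, which is the property that in-memory analytics benefits from. A file format such as Parquet additionally applies encoding and compression that must be undone before the data can be used this way.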
Maven
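We include the following dependencies
<!-- arrow-memory itself is a parent POM without a JAR, so an allocator
     implementation such as arrow-memory-netty is used here instead -->
<dependency>
  <groupId>org.apache.arrow</groupId>
  <artifactId>arrow-memory-netty</artifactId>
  <version>6.0.1</version>
</dependency>
<dependency>
  <groupId>org.apache.arrow</groupId>
  <artifactId>arrow-vector</artifactId>
  <version>6.0.1</version>
</dependency>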
Example
To write, we do the following
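import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowFileWriter;

// Set up the allocator, the vector, and a root that wraps the vector.
// ArrowWriter is abstract, so the concrete ArrowFileWriter is used below.
try (BufferAllocator allocator = new RootAllocator(Integer.MAX_VALUE);
     VarCharVector vector = new VarCharVector("vector", allocator);
     VectorSchemaRoot root = VectorSchemaRoot.of(vector)) {
  // Write data to the vector
  vector.setSafe(0, "Apache".getBytes(StandardCharsets.UTF_8));
  vector.setSafe(1, "Arrow".getBytes(StandardCharsets.UTF_8));
  vector.setSafe(2, "Java".getBytes(StandardCharsets.UTF_8));
  vector.setValueCount(3);
  // The root must also know how many rows the batch holds
  root.setRowCount(3);
  // Write the batch to a file in the Arrow IPC file format
  try (FileOutputStream out = new FileOutputStream("arrow-data.arrow");
       ArrowFileWriter writer = new ArrowFileWriter(root, null, out.getChannel())) {
    writer.start();
    writer.writeBatch();
    writer.end();
  }
}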
To read, we do the following
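import java.io.FileInputStream;
import java.nio.charset.StandardCharsets;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.ipc.ArrowFileReader;

// Now, let's read the data we just wrote.
// ArrowReader is abstract, so the concrete ArrowFileReader is used here.
try (BufferAllocator allocator = new RootAllocator(Integer.MAX_VALUE);
     FileInputStream in = new FileInputStream("arrow-data.arrow");
     ArrowFileReader reader = new ArrowFileReader(in.getChannel(), allocator)) {
  // Read the schema and load each record batch in turn
  while (reader.loadNextBatch()) {
    // The reader owns this vector, so it is not closed separately
    VarCharVector vector =
        (VarCharVector) reader.getVectorSchemaRoot().getVector("vector");
    // Iterate over the values in the vector
    for (int i = 0; i < vector.getValueCount(); i++) {
      System.out.println(new String(vector.get(i), StandardCharsets.UTF_8));
    }
  }
}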