Skip to content

V0.5.0/flight#176

Merged
thanos merged 14 commits intomainfrom
v0.5.0/flight
Apr 16, 2026
Merged

V0.5.0/flight#176
thanos merged 14 commits intomainfrom
v0.5.0/flight

Conversation

@thanos
Copy link
Copy Markdown
Owner

@thanos thanos commented Apr 16, 2026

No description provided.

thanos added 14 commits April 14, 2026 23:24
  Add Flight SQL NIF layer (flight_sql.rs) with connect/query/execute/ - closed #127
  stream_schema/stream_next NIFs, - closed #128
  Elixir facade (Client), - closed #129
  behaviour (ClientBehaviour), - closed #130
  NIF-backed impl (ClientImpl), error type (Error) - closed #130,
  materialised result type (Result), options parser, and top-level docs - closed #131.

  Extends ExArrow.Stream with :flight_sql backend. Adds 34 unit test - closed #132
  using Mox for isolation; real-connect tests cover options validation without requiring a live server. - #133
  Fix 1 — options.ex: IPv6 URI support - closed #137

  - Added bracketed IPv6 parsing: [::1]:32010 is now handled via a Regex.run path.
  - Extracted parse_port/3 helper shared by both paths.
  - Removed dead 0:0:0:0:0:0:0:1 from @loopback_hosts (non-canonical notation, never reachable). Kept ::1 (produced by bracket stripping) and ip6-localhost.
  - Updated docstring to document the three accepted URI forms.

  Fix 2 — client.ex:119: wrong error code on from_stream/1 failure - closed #138

  - Simplified query/2 to with {:ok, stream} <- ..., do: Result.from_stream(stream). The typed %Error{} from from_stream/1 propagates directly; no re-wrapping with the
  wrong :transport_error code.

  Fix 3 — result.ex:50: silent stream-error discard - closed #139

  - Replaced Stream.to_list/1 (which raises on error) with a private collect_batches/2 that uses Stream.next/1 in a loop and returns {:error, %Error{}} on mid-stream
  failure.
  - Schema errors are wrapped via wrap_schema_error/1 using :protocol_error.
  - Updated @SPEC from_stream/1 to return {:error, Error.t()} instead of {:error, String.t()}.

  Fix 4 — stream.ex:98: structured error info lost - closed #140

  - Changed {:error, {_code, _status, msg}} -> {:error, msg} to {:error, {code, _status, msg}} -> {:error, "[#{code}] #{msg}"}. The gRPC code is preserved in the error
  message while keeping the String.t() contract of next/1.

  Fix 5 — flight_sql.rs:149–159: incomplete gRPC code mapping - closed #141

  - Added explicit mappings for Cancelled, Unavailable, DeadlineExceeded → :transport_error; OutOfRange → :invalid_argument; Unknown, ResourceExhausted, FailedPrecondition,
   Aborted, AlreadyExists, DataLoss → :server_error. Reduces misleading :server_error for transport-class codes.

  Fixes 6 & 7 — flight_sql.rs / client.ex: serialization documentation - closed #142 closed #143

  - Added a doc comment on flight_sql_query NIF explaining the Mutex serialisation and recommending separate handles for concurrent workloads.
  - Added a > #### Concurrency warning admonition to Client.stream_query/2.

  Fix 8 — error.ex:55: grpc_status type mismatch - closed #145

  - Widened grpc_status: non_neg_integer() | nil to integer() | nil to correctly permit negative gRPC status codes that the NIF may theoretically emit.
  - ExArrow.FlightSQL.Options — no dedicated test module. Every TLS variant (nil+loopback, nil+remote, true, false, keyword, invalid), URI edge cases (no port, bad port,
  extra colons, IPv6), and headers validation deserves direct unit tests. Currently only indirectly exercised through Client.connect error paths.
  - ExArrow.FlightSQL.ClientImpl — no direct tests. wrap_nif_error/1 has three clauses (tuple, binary, fallback) — the fallback is untested.
  - Client.query/2 success path — the with mock — query/2 test (client_test.exs:82) calls Client.stream_query, not Client.query. The describe label is misleading. There's
  no test that exercises Client.query/2 successfully returning a %Result{}.
  - Client.query!/2 success path — only the error/raise path is covered.
  - Client.close/1 real impl — only the mock test exists; the default no-op :ok return from ClientImpl.close/1 is untested.
  - ExArrow.Stream :flight_sql clauses (schema/1, next/1) — no tests. At minimum, pattern-match tests on the {:error, triple} → {:error, msg} path via mock.
  - Result.from_stream/1 — no test for a stream that yields batches successfully (only mock tests in client_test skip the materialisation).

  Documentation issues

  - client.ex:74 — "defaults to port 31337" — this is not the Flight SQL default. Arrow's conventional Flight SQL port is 32010. The number 31337 looks like a
  leftover/joke. Either use 32010 or remove the default (require explicit port).
  - flight_sql.ex:44 — "Flight SQL server implementation" listed as deferred is ambiguous; clarify that ExArrow provides a client only.
  - client_behaviour.ex:2 — has a public @moduledoc, but this is an internal behaviour (referenced via Application.get_env). Either use @moduledoc false or expose as a
  formally supported extension point with version guarantees.
  - result_test.exs:26-44 — "Explorer not available" describe-block is mislabelled (Explorer IS available in test env, so the test exercises the rescue path). Rename or
  split into two tests (one with Explorer, one simulating its absence via Code.ensure_loaded? stub).
  - result_test.exs:63-78 — same mislabelling: "Nx not available" tests the rescue path instead.
  - client.ex:50 — "Tested against DuckDB Flight SQL server (v0.10+) and DataFusion" — unless there are actual integration tests tagged flight_sql_integration, this is
  aspirational. Qualify or remove.
  - Commit message 7035b95 — contains spurious - closed #127, - closed #128, etc. littered through the body, clearly a template glitch. Worth fixing via git commit --amend
  if unpushed, or noting for next commit.
  - result.ex:60-63 — docstring claims "IPC round-trip ... correct for all standard Arrow types, but may be lossy for decimal128, map" — this is a strong claim without a
  reference or test demonstrating it.
  flight_sql.rs — Added ok and error to the rustler::atoms! block. Replaced Atom::from_str(env, "error").unwrap() in encode_sql_error and Atom::from_str(env, "ok").unwrap()
   in flight_sql_execute with the generated error() and ok() atom functions. Both call sites are now zero-allocation and consistent with the rest of the module.

  result.ex — Replaced the two-clause with (which only reads as "unwrap two results in sequence") with nested case expressions. The control flow is now explicit: schema
  error short-circuits, batch error short-circuits, success builds the struct.

  client.ex — Added @compile {:inline, [impl: 0]} so the Application.get_env lookup is inlined at each call site by the compiler rather than dispatching through an extra
  stack frame on every public function call.

  options.ex — Replaced the terse comment on @loopback_hosts with one that explains why it's a module attribute (single source of truth, shared across callers) and notes
  the IPv6 canonical form convention.

  client_impl.ex — Added require Logger and a Logger.warning/1 call in the fallback wrap_nif_error/1 clause. If the NIF ever returns an unexpected shape, the log line will
  surface it immediately rather than silently encoding it as :transport_error and making the caller chase a phantom transport error.
…closed #164, closed #165, closed #166

  Implement the Enumerable protocol for ExArrow.Stream so all Enum and
  Stream functions work directly on stream handles returned by
  stream_query/2.  Adds halt (partial consumption), suspend (lazy), and
  error-raising semantics.  Expands the Flight SQL guide with Enum-first
  streaming examples and resource lifecycle documentation.
…losed #167

  Add three metadata discovery functions backed by new dirty-IO NIFs. - closed #168
  A shared flight_info_to_stream helper in Rust deduplicates the
  FlightInfo → DoGet → stream path across all three endpoints. - closed #169
  All functions return lazy ExArrow.Stream values consumable with Enum. - closed #170
  Add ExArrow.FlightSQL.Statement with execute/1 and execute_update/1,
  backed by three new dirty-IO NIFs (flight_sql_prepare,
  flight_sql_prepared_execute, flight_sql_prepared_execute_update). - closed #172

  Lift FlightSqlClientHandle.client from Mutex<T> to Arc<Mutex<T>> so the
  prepared statement resource can share the original channel for DoGet
  calls after PreparedStatement::execute returns a FlightInfo. - closed #173

Refactor flight_info_to_stream to accept &Arc<Mutex<Client>> i - closed #174

The fix looks correct. When Pythonx is not available:
  - from_py/1 delegates to Adbc.Helper.from_py(nil) which returns {:error, %RuntimeError{...}} with the "pythonx not available" message
  - from_py!/1 calls from_py(nil), destructures the guaranteed {:error, reason}, and raises it

  The type system can now infer the return type of each clause unambiguously, eliminating the "clause will never match" warning.
  Tests (flight_sql_result_test.exs) — 8 new tests (15 total, up from 7):
  - to_dataframe/1 real NIF round-trip: returns Explorer.DataFrame, correct row count, correct column names
  - to_tensor/2 real NIF round-trip: returns Nx.Tensor for "id" (int64), returns conversion error for non-numeric "name", returns conversion error for unknown column

  Both test blocks are guarded with Code.ensure_loaded? so they compile-away cleanly when Explorer/Nx aren't in the environment.

  lib/ex_arrow/flight_sql.ex — Updated scope section: added prepared statements, metadata, Explorer/Nx to supported list; removed stale "Prepared statements (v0.6.0)"
  deferred item.

  docs/flight_sql_guide.md — Updated v0.5.0 scope supported list to include Explorer/Nx integration entries; removed the duplicate "Prepared statements" from the deferred
  list.
 - closed #177
 - closed #178
 - closed #179
  Adds full Arrow Flight SQL support: Client.connect/query/stream_query/
  execute/prepare plus get_tables/get_db_schemas/get_sql_info metadata.
  ExArrow.Stream now implements Enumerable. Structured errors, Mox-compatible
  ClientBehaviour, Explorer/Nx Result integration, and a comprehensive guide.

  Also fixes an Elixir 1.17+ typing warning in deps/adbc (unreachable
  {:ok, stream_ref, capsule} clause in Adbc.Result.from_py/1 when
  Pythonx is not loaded).

  485 tests, 0 failures.

  Phase 7 is complete. Here is what was done:

  Version bumps
  - mix.exs: 0.4.0 → 0.5.0
  - native/ex_arrow_native/Cargo.toml: 0.4.0 → 0.5.0

  CHANGELOG.md — new [0.5.0] - 2026-04-16 entry covering all added modules, the Enumerable impl, the ADBC fix, and the docs changes.

  RELEASE_NOTES_0_5_0.md — new file following the same format as prior release notes; covers quick start, TLS/auth, lazy streaming, prepared statements, metadata,
  Explorer/Nx integration, Mox testing, error codes, new public API table, changed behaviour, bug fixes, and upgrade guide.

  README.md
  - Updated tagline to mention Flight SQL
  - Added {:ex_arrow, "~> 0.5.0"} in install examples
  - Added Flight SQL quick-start snippet
  - Added full ## Arrow Flight SQL: remote query servers section with connection, query, streaming, TLS, auth, prepared statements, metadata, Explorer/Nx, and Mox examples
  - Updated mermaid architecture diagram to show Flight SQL as a distinct path
  - Added [Shipped (v0.5.0)] roadmap entry; moved Flight SQL out of "Longer-term"
  - Added docs/flight_sql_guide.md to the Documentation list

  docs/overview.md — added Flight SQL guide row to the guide table and ExArrow.FlightSQL.Client/Result rows to the optional integrations table.
Root cause: arrow-flight's FlightSqlServiceClient converts every tonic::Status error into ArrowError::IpcError(format!("{status:?}")), embedding the gRPC code as a debug
  string. The old code wrapped these as FlightError::Arrow(e), which our flight_error_to_term mapped uniformly to :transport_error with grpc_status=0.

  Fix (flight_sql.rs):

  1. tonic_code_to_atom — extracted the tonic::Code → (atom, grpc_int) mapping out of flight_error_to_term so it can be reused.
  2. parse_tonic_status_debug — parses tonic::Status's derived Debug format (Status { code: VariantName, message: "...", ... }) to recover the code and message.
  3. arrow_error_to_term — for ArrowError::IpcError that matches the tonic pattern, emits the fully structured (code, grpc_status, message) triple; falls back to
  :transport_error for genuine Arrow errors.
  4. All 10 call sites that previously did flight_error_to_term(env, FlightError::Arrow(e)) now call arrow_error_to_term(env, &e) directly.

  The FlightRecordBatchStream path (flight_sql_stream_next) already went through FlightError::Tonic and was unaffected. 485 tests pass.
  - result.ex:102: Code.ensure_loaded?(ExArrow.Explorer) → Code.ensure_loaded?(Explorer.DataFrame)
  - result.ex:153: Code.ensure_loaded?(ExArrow.Nx) → Code.ensure_loaded?(Nx)

  ExArrow.Explorer and ExArrow.Nx are always compiled as part of this library, so the old guards were trivially true. The real optional deps are Explorer.DataFrame (from
  the :explorer Hex package) and Nx (from :nx), matching the pattern used in ExArrow.Explorer itself (@explorer_available Code.ensure_loaded?(Explorer.DataFrame)).
…error" test to assert %Error{code: :unavailable, grpc_status: 14, message: "server gone"} instead of

  the old code: :transport_error + string matching. This aligns with the issue 183 fix where collect_batches now propagates structured gRPC triples directly.

  lib/ex_arrow/flight_sql/options.ex — Two bug fixes that the new edge-case tests exposed:
  - Empty string "" is now rejected (was accepted as a host-only URI defaulting to port 32010)
  - Port strings with leading + or - signs are now rejected (Integer.parse/1 accepts them but they aren't valid port syntax)

  test/ex_arrow/flight_sql_integration_test.exs — Fixed the skip mechanism from the non-existent ExUnit.SkipError to ExUnit.Assertions.flunk/1.

  lib/ex_arrow/flight_sql/client.ex — Removed the unused ClientBehaviour alias.

 - closed #182, closed #191, closed #192
…till pass. Here's what was done:

  1. .sobelow-conf — Generated via mix sobelow --skip --save-config. The skip: true key activates # sobelow_skip comment processing, which was previously only enabled with
  the --skip CLI flag.
  2. client.ex — The # sobelow_skip ["SQL.Query"] comment is now correctly placed between @SPEC and def query!, with a clarifying comment. The skip mechanism works by
  sobelow converting # sobelow_skip to a @sobelow_skip module attribute during file preprocessing, then pairing it with the immediately following function definition via
  reversed-AST traversal.
@thanos thanos merged commit b0c21df into main Apr 16, 2026
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant