Optimize the row-position read performance of TsFileDataFrame #794

Open
ycycse wants to merge 3 commits into apache:develop from ycycse:optimize-python-dataset-row-read


ycycse commented Apr 25, 2026

This PR optimizes Python TsFileDataFrame / Timeseries row-position reads.

Before this change, single-series position reads iterated row by row with ResultSet.next(). This PR switches that path to native row-query Arrow batch reads and skips timestamp materialization when the caller only needs values.
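To illustrate the shape of the change, here is a minimal, self-contained sketch that contrasts a row-by-row cursor loop with a batched columnar read that skips timestamps unless asked for them. It does not use the TsFile or Arrow APIs: `RowCursor`, `read_rows_iterative`, `read_rows_batched`, the `with_timestamps` flag, and the in-memory `TIMESTAMPS` / `VALUES` arrays are all hypothetical stand-ins chosen for this example, so treat it as an analogy rather than the PR's implementation.

```python
from array import array

# Hypothetical in-memory stand-in for one series; NOT the TsFile API.
TIMESTAMPS = array("q", range(0, 10_000, 10))        # 1000 int64 timestamps
VALUES = array("d", (i * 0.5 for i in range(1000)))  # 1000 float64 values


class RowCursor:
    """Hypothetical cursor that mimics a ResultSet.next() style loop."""

    def __init__(self, start, end):
        self._pos = start - 1
        self._end = end

    def next(self):
        # Advance one row; return False once the range is exhausted.
        self._pos += 1
        return self._pos < self._end

    def get_timestamp(self):
        return TIMESTAMPS[self._pos]

    def get_value(self):
        return VALUES[self._pos]


def read_rows_iterative(start, end):
    """Old-style path: one cursor step and two lookups per row."""
    cursor = RowCursor(start, end)
    timestamps, values = [], []
    while cursor.next():
        timestamps.append(cursor.get_timestamp())
        values.append(cursor.get_value())
    return timestamps, values


def read_rows_batched(start, end, batch_size=256, with_timestamps=False):
    """Batched path: copy whole column slices, timestamps only on request."""
    values = array("d")
    timestamps = array("q") if with_timestamps else None
    for lo in range(start, end, batch_size):
        hi = min(lo + batch_size, end)
        values.extend(VALUES[lo:hi])
        if with_timestamps:
            timestamps.extend(TIMESTAMPS[lo:hi])
    return timestamps, values


# Value-only read of rows [100, 700): no timestamp column is built.
_, vals = read_rows_batched(100, 700)
print(len(vals))
```

The batched function does per-batch work instead of per-row work and never allocates a timestamp column for value-only callers, which is the same pair of savings the description attributes to this PR.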

With this PR, TsFile direct reads improve from 143.2 to 536.2 samples/s, roughly a 3.7x speedup.
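As a quick check of the quoted ratio, the speedup can be recomputed from the two throughput figures above. The variable names are illustrative only.

```python
# Throughput figures quoted in the PR description, in samples/s.
before = 143.2
after = 536.2

# Speedup is the ratio of new throughput to old throughput.
speedup = after / before
print(f"{speedup:.2f}x")  # prints "3.74x"
```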
