Conversation
Some details about how the new approach differs from the old one would be appreciated.
The insert code path now skips many unnecessary steps: triggers, validations (except field-level NOT NULL checks) and index maintenance are all absent during restore. Also, all records are inserted into a dedicated in-memory buffer to avoid holding latches on data page buffers for a long time. When the in-memory buffer is full, its contents are copied into the actual DB buffers and go to disk in the usual way. The in-memory buffer starts at 1 page and is resized to 8 pages when the first 8 pages of the relation are filled. Blob contents are put into a separate in-memory buffer that works the same way. The code is put into two main classes:

* `BulkInsert` - implements the in-memory buffers and the code that works with records (mostly an analog/copy of some DPM parts), and
* `BulkInsertNode` - replaces `StoreNode` and contains some further optimizations, such as pre-calculated target descriptors.

On 4/15/26 17:55, Vlad Khorsun wrote:
*hvlad* left a comment (FirebirdSQL/firebird#8990)
<#8990 (comment)>

But what about `relPages->rel_*_data_space`, which are stored at the database level, not the attachment level? From the new code it seems that they should be moved back to attachment from database. Am I missing something?
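The buffer sizing policy described in the comment above (start with a 1-page buffer, switch to an 8-page buffer once the first 8 pages of the relation have been filled, flush to the real DB buffers when full) can be sketched roughly as follows. This is an illustrative model only, not the actual `BulkInsert` implementation; all names, the page size, and the growth condition as coded here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch of the sizing policy described in the comment:
// start with a 1-page buffer, grow to 8 pages once the first 8 pages of
// the relation have been filled. Names and page size are made up.
class BulkBufferSketch
{
public:
    static constexpr size_t PAGE_SIZE = 8192;  // assumed page size

    BulkBufferSketch() { m_buffer.resize(PAGE_SIZE); }

    size_t pagesFlushed() const { return m_pagesFlushed; }
    size_t bufferPages() const { return m_buffer.size() / PAGE_SIZE; }

    void putBytes(size_t len)
    {
        if (m_used + len > m_buffer.size())
            flush();
        m_used += len;
    }

    void flush()
    {
        // The real code would copy the contents into actual DB buffers,
        // which then go to disk in the usual way; here we only count pages.
        m_pagesFlushed += (m_used + PAGE_SIZE - 1) / PAGE_SIZE;
        m_used = 0;

        // Grow from 1 page to 8 pages once the first 8 pages are filled
        if (m_pagesFlushed >= 8 && bufferPages() == 1)
            m_buffer.resize(8 * PAGE_SIZE);
    }

private:
    std::vector<char> m_buffer;
    size_t m_used = 0;
    size_t m_pagesFlushed = 0;
};
```

According to the comment, a second buffer of the same kind handles blob contents independently.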
These fields are not used in the bulk insert code and thus were not affected by this patch. At the next step I'm going to:

* leave as is:
* make atomic:
* remove:
I think the term "bulk insert" is a little misleading here. Oracle uses "direct-path insert" for the path where data goes straight into data pages.
|
@dyemanov: did you have a chance to look at it?
Halfway done; hopefully it will be completed during the weekend.
*dyemanov* left a comment
The only thing (besides my other comments) that I don't really like is the code duplication between dpm.epp and BulkInsert.cpp. For example, storage of a big record seems to share about 80% of the code. Do you think it could be improved (e.g. by adding separate routines to dpm.epp that deal with a preallocated page buffer)?
```cpp
auto node = FB_NEW_POOL(pool) BulkInsertNode(pool);

if (csb->csb_blr_reader.peekByte() == blr_marks)
    /*node->marks |= */PAR_marks(csb);
```
If marks are skipped, why can blr_marks be generated after blr_bulk_insert? Or is it just reserved for possible future use?

Yes, just in case it is used later.
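The hunk under discussion uses a peek-then-consume pattern: if the next BLR byte is `blr_marks`, the marks clause is parsed and (currently) discarded. A generic sketch of that parser pattern is below; the opcode value, the clause layout (length byte plus payload), and all names are illustrative assumptions, not the real BLR reader.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative opcode value; the real blr_marks constant differs.
constexpr uint8_t blr_marks_sketch = 0x7E;

// Minimal stand-in for the BLR reader: peekByte() looks at the next
// byte without consuming it, getByte() consumes it.
struct BlrReaderSketch
{
    const uint8_t* pos;
    const uint8_t* end;

    uint8_t peekByte() const { return pos < end ? *pos : 0; }
    uint8_t getByte() { return *pos++; }
};

// Parse an optional "marks" clause: opcode, a length byte, then that
// many payload bytes. The parsed value is discarded, mirroring the
// commented-out assignment in the hunk above.
bool parseOptionalMarks(BlrReaderSketch& r)
{
    if (r.peekByte() != blr_marks_sketch)
        return false;           // clause absent: consume nothing
    r.getByte();                // consume the opcode
    const uint8_t len = r.getByte();
    for (uint8_t i = 0; i < len; ++i)
        r.getByte();            // skip the payload
    return true;
}
```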
```cpp
tail.csb_flags |= csb_store;

jrd_rel* const relation = tail.csb_relation(tdbb);
if (relation)
```
How could a relation be missing here, given that the target stream is always a RelationSourceNode or LocalTableSourceNode? Maybe an assert instead? Or at least add an fb_assert(false) to the else branch?
```cpp
record->setTransactionNumber(transaction->tra_number);

assignValues(tdbb, request, relation, record, toDescs.begin());
cleanupRpb(tdbb, rpb);
```
I suppose cleanupRpb() makes no sense after record->nullify(), which not only sets the NULL flags but also zeroes the record data.
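The point of this remark can be illustrated with a toy model; the layout and names below are made up and do not reflect Firebird's actual Record class. Because nullify() both sets every null flag and zeroes the data in one pass, a subsequent cleanup over the data has nothing left to do.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Toy model of a record: a null-flag bitmap plus raw data bytes.
// Illustrative only; the real Record class in Firebird differs.
struct ToyRecord
{
    std::vector<bool> nullFlags;
    std::vector<unsigned char> data;

    ToyRecord(size_t fields, size_t dataLen)
        : nullFlags(fields, false), data(dataLen, 0xAA) {}

    // Mirrors the behaviour discussed in the review: nullify() sets
    // every null flag AND zeroes the record data.
    void nullify()
    {
        nullFlags.assign(nullFlags.size(), true);
        std::memset(data.data(), 0, data.size());
    }

    bool allDataZero() const
    {
        for (unsigned char b : data)
            if (b != 0)
                return false;
        return true;
    }
};
```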
```cpp
bulk->putRecord(tdbb, rpb, transaction);
REPL_store(tdbb, rpb, transaction);

JRD_reschedule(tdbb);
```
JRD_reschedule() is usually called for every record inside cursor->fetchNext() -> RecordSource::internalGetRecord(); is it really necessary to call it again after the store?
```cpp
}
auto bulk = transaction->getBulkInsert(tdbb, relation, true);
assignValues(tdbb, request, relation, record, impure->descs->begin());
cleanupRpb(tdbb, rpb);
```
```cpp
ERR_post(Arg::Gds(isc_not_valid) << Arg::Str(name) << Arg::Str(NULL_STRING_MARK));
}

record->setNull(toField->fieldId);
```
assignValues() seems to always be called after record->nullify(), so all fields are already initialized to NULL and it shouldn't be necessary to do that again.
```cpp
record->nullify();
record->setTransactionNumber(transaction->tra_number);

if (!(impure->flags & INIT_DONE))
```
It looks like you could check for !impure->descs and thus avoid impure->flags altogether?
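The suggestion is the common pattern of letting a lazily-allocated pointer double as the "initialized" marker, so a separate flags word becomes unnecessary. A generic sketch under assumed names (this is not the actual impure-area code):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Generic sketch of the suggestion: instead of keeping a separate
// INIT_DONE bit in a flags word, use the lazily-allocated member
// itself as the marker. Names are illustrative.
struct ImpureSketch
{
    std::unique_ptr<std::vector<int>> descs;  // null until first use
    int initCount = 0;                        // counts real initializations

    void ensureInit()
    {
        if (!descs)  // replaces: if (!(impure->flags & INIT_DONE))
        {
            descs = std::make_unique<std::vector<int>>(4, 0);
            ++initCount;
        }
    }
};
```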
```cpp
inline constexpr UCHAR ppg_dp_swept     = 0x04;  // Sweep has nothing to do on data page
inline constexpr UCHAR ppg_dp_secondary = 0x08;  // Primary record versions not stored on data page
inline constexpr UCHAR ppg_dp_empty     = 0x10;  // Data page is empty
inline constexpr UCHAR ppg_dp_reserved  = 0x20;  // Slot is reserved for bulk insert
```
If the slot was reserved, but the server crashed before the slot was actually used, can it be reused for regular (non-bulk) operations? If not, shouldn't gfix clear this flag if such a case is detected?

> If the slot was reserved, but the server crashed before the slot was actually used, can it be reused for regular (non-bulk) operations?

No, it can't.

> If not, shouldn't gfix clear this flag if such a case is detected?

Yes, and validation always clears it, see:

* Validation::walk_data_page() never sets ppg_dp_reserved in pp_bits; thus,
* Validation::walk_pointer_page() always clears this bit.
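The clearing mechanism described in the answer can be sketched with plain bit arithmetic. The flag values are copied from the hunk above; the two functions are simplified stand-ins for the real Validation methods, and their signatures are assumptions for illustration only.

```cpp
#include <cassert>

using UCHAR = unsigned char;

// Flag values from the hunk above
inline constexpr UCHAR ppg_dp_swept     = 0x04;
inline constexpr UCHAR ppg_dp_secondary = 0x08;
inline constexpr UCHAR ppg_dp_empty     = 0x10;
inline constexpr UCHAR ppg_dp_reserved  = 0x20;

// Simplified stand-in for Validation::walk_data_page(): recomputes the
// per-page bits from the page state and never sets ppg_dp_reserved.
UCHAR walk_data_page_bits(bool empty, bool secondary)
{
    UCHAR bits = 0;
    if (empty)
        bits |= ppg_dp_empty;
    if (secondary)
        bits |= ppg_dp_secondary;
    return bits;  // ppg_dp_reserved intentionally never set here
}

// Simplified stand-in for Validation::walk_pointer_page(): replaces the
// stored slot bits with the recomputed ones, which clears a stale
// ppg_dp_reserved left over after a crash.
UCHAR fix_pointer_page_slot(UCHAR /*storedBits*/, bool empty, bool secondary)
{
    return walk_data_page_bits(empty, secondary);
}
```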
```cpp
CCH_RELEASE(tdbb, &dpWindow);

if (m_isPrimary)
    tdbb->bumpStats(RecordStatType::INSERTS, m_relation->getId(), m_current->dpg_count);
```
Why is this record-level counter increased per page rather than per record (inside putRecord(), I guess)?

Until flush(), the records are not actually inserted into the relation yet. Also, this is a kind of small optimization.
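Numerically the two approaches agree: one bumpStats() call per flushed page, adding that page's dpg_count, yields the same total as one increment per record. A trivial sketch (names assumed, not engine code):

```cpp
#include <cassert>
#include <cstddef>

// Per-page bump, as in the hunk above: one call per flushed data page,
// increasing the INSERTS counter by that page's record count at once.
size_t countPerPage(const size_t* dpgCounts, size_t pages)
{
    size_t inserts = 0;
    for (size_t i = 0; i < pages; ++i)
        inserts += dpgCounts[i];  // analogous to bumpStats(..., dpg_count)
    return inserts;
}
```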
Yes, this is not an easy question and I considered it, of course. The current code is the best compromise I can think of.

You mean, move (half of) BulkInsert into DPM? :)
I'll implement the suggested changes and re-test, then commit them.
In Firebird 5, parallel restore was introduced. It contains "bulk insert" code that lowers contention on pointer pages (PP) among concurrent writers; it also makes each writer fill its own dedicated data page (DP) with records.

With the shared metadata cache in v6 that code became broken, and parallel restore creates a lot of unused data pages. Instead of fixing that first attempt at bulk insert, I offer a new approach that fixes the issue, has better performance and could be used more widely.

Note, the patch doesn't remove the old code; that could be done a bit later, after agreement on the new code.