New implementation of bulk insert#8990

Open
hvlad wants to merge 2 commits into master from work/bulk_insert

Conversation

@hvlad
Member

@hvlad hvlad commented Apr 15, 2026

Firebird 5 introduced parallel restore. It contains "bulk insert" code that lowers contention on pointer pages (PP) between concurrent writers, and it makes each writer fill its own dedicated data page (DP) with records.

With the shared metadata cache in v6 that code became broken, and parallel restore creates a lot of unused data pages. Instead of fixing that first attempt at a bulk insert ability, I offer a new approach that fixes the issue, has better performance, and could be used more widely.

Note: the patch doesn't remove the old code; that can be done a bit later, after agreement on the new code.

@hvlad hvlad self-assigned this Apr 15, 2026
@dyemanov
Member

Some details about how the new approach is different from the old one would be appreciated.

@hvlad
Member Author

hvlad commented Apr 15, 2026

The insert code path now avoids a lot of unnecessary steps, such as triggers, validations (except NOT NULL at the field level) and index maintenance - all of these are absent during restore. Also, all records are inserted into a dedicated in-memory buffer to avoid endless latches on data page buffers. When the in-memory buffer is full, its contents are copied into actual DB buffers and go to disk in the usual way. The in-memory buffer starts at 1 page and is resized to 8 pages when the first 8 pages of the relation are filled. Also, blob contents are put into a separate in-memory buffer that works in the same way.

The code is put into two main classes:

  • BulkInsert - implements the in-memory buffers and the code that works with records (mostly an analog/copy of some DPM parts), and
  • BulkInsertNode - replaces StoreNode and contains some further optimizations, such as pre-calculated target descriptors.
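The buffering scheme described above (start at one page, grow to eight once the first eight pages of the relation are filled, copy to real DB buffers when full) could be sketched roughly as below. This is a simplified illustration, not the patch's actual BulkInsert code; all names here (StagingBuffer, PAGE_SIZE, flushedPages) are invented for the example.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

constexpr size_t PAGE_SIZE = 4096; // illustrative, not the real DB page size

class StagingBuffer
{
public:
    // Start with a single page, as the description above says.
    StagingBuffer() : m_buffer(PAGE_SIZE), m_used(0), m_flushedPages(0) {}

    size_t capacityPages() const { return m_buffer.size() / PAGE_SIZE; }
    size_t flushedPages() const { return m_flushedPages; }

    void put(const void* data, size_t len)
    {
        assert(len <= PAGE_SIZE);
        if (m_used + len > m_buffer.size())
            flush();
        std::memcpy(m_buffer.data() + m_used, data, len);
        m_used += len;
    }

    void flush()
    {
        // The real code would copy the staged pages into actual DB buffers
        // here; we only count the pages that went out.
        m_flushedPages += (m_used + PAGE_SIZE - 1) / PAGE_SIZE;
        m_used = 0;

        // Grow to 8 pages once the first 8 pages have been filled.
        if (m_flushedPages >= 8 && capacityPages() < 8)
            m_buffer.resize(8 * PAGE_SIZE);
    }

private:
    std::vector<unsigned char> m_buffer;
    size_t m_used;
    size_t m_flushedPages;
};
```

In this toy model records accumulate latch-free in private memory and only touch shared DB buffers on flush(), which is the contention win the description aims at.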

@AlexPeshkoff
Member

AlexPeshkoff commented Apr 15, 2026 via email

@hvlad
Member Author

hvlad commented Apr 15, 2026

But what about relPages->rel_*_data_space stored at database, not attachment level? From new code it seems that they should be moved back to attachment from database? Am I missing something?

These fields are not used in the bulk insert code and thus were not affected by this patch.

As a next step I'm going to:

leave as is:

	ULONG rel_index_root;		// index root page number
	USHORT rel_pg_space_id;

make atomic:

	ULONG rel_data_pages;		// count of relation data pages
	ULONG rel_slot_space;		// lowest pointer page with slot space
	ULONG rel_pri_data_space;	// lowest pointer page with primary data page space
	ULONG rel_sec_data_space;	// lowest pointer page with secondary data page space

remove:

	ULONG rel_last_free_pri_dp;	// last primary data page found with space
	ULONG rel_last_free_blb_dp;	// last blob data page found with space
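A minimal sketch of what the "make atomic" step could look like, assuming plain std::atomic fields with relaxed ordering; the struct and method names are illustrative (not the actual patch), and only rel_data_pages and rel_slot_space are shown:

```cpp
#include <atomic>
#include <cassert>

using ULONG = unsigned int;

struct RelPages
{
    std::atomic<ULONG> rel_data_pages{0};            // count of relation data pages
    std::atomic<ULONG> rel_slot_space{0xFFFFFFFFu};  // lowest pointer page with slot space

    void pagesAdded(ULONG count)
    {
        // A simple counter only needs an atomic increment.
        rel_data_pages.fetch_add(count, std::memory_order_relaxed);
    }

    // Record that pointer page 'pp' has free slot space; keep the minimum
    // under concurrent updates with a CAS loop instead of a lock.
    void slotSpaceFound(ULONG pp)
    {
        ULONG cur = rel_slot_space.load(std::memory_order_relaxed);
        while (pp < cur &&
               !rel_slot_space.compare_exchange_weak(cur, pp, std::memory_order_relaxed))
        {
            // a failed CAS refreshed 'cur'; retry while pp is still lower
        }
    }
};
```

The CAS loop makes "lower the lowest PP with space" safe for several concurrent writers without taking the relation-level lock.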

@dyemanov dyemanov self-requested a review April 15, 2026 16:54
@aafemt
Contributor

aafemt commented Apr 16, 2026

I think the term "bulk insert" is a little misleading here. Oracle uses "direct-path insert" for the path where data goes straight into data pages.

@hvlad
Member Author

hvlad commented May 1, 2026

@dyemanov: did you have a chance to look at it?

@dyemanov
Member

dyemanov commented May 1, 2026

@dyemanov: did you have a chance to look at it?

Halfway done, hopefully to be completed during the weekend.

Member

@dyemanov dyemanov left a comment

The only thing (besides my other comments) that I don't really like is the code duplication between dpm.epp and BulkInsert.cpp. For example, storage of a big record seems to share about 80% of its code. Do you think it could be improved (e.g. by adding separate routines to dpm.epp that deal with a preallocated page buffer)?

Comment thread src/dsql/StmtNodes.cpp
auto node = FB_NEW_POOL(pool) BulkInsertNode(pool);

if (csb->csb_blr_reader.peekByte() == blr_marks)
/*node->marks |= */PAR_marks(csb);
Member

If marks are skipped, why can blr_marks be generated after blr_bulk_insert? Or is it just reserved for possible future use?

Member Author

Yes, just in case it's used later.

Comment thread src/dsql/StmtNodes.cpp
tail.csb_flags |= csb_store;

jrd_rel* const relation = tail.csb_relation(tdbb);
if (relation)
Member

How could a relation be missing here, given that the target stream is always RelationSourceNode or LocalTableSourceNode? Maybe an assert instead? Or at least add an fb_assert(false) to the else part?

Member Author

Will add assert, thanks.

Comment thread src/dsql/StmtNodes.cpp
record->setTransactionNumber(transaction->tra_number);

assignValues(tdbb, request, relation, record, toDescs.begin());
cleanupRpb(tdbb, rpb);
Member

I suppose cleanupRpb() makes no sense after record->nullify(), which not only sets the NULL flags but also zeroes the record data.

Member Author

Agree, will remove it.

Comment thread src/dsql/StmtNodes.cpp
bulk->putRecord(tdbb, rpb, transaction);
REPL_store(tdbb, rpb, transaction);

JRD_reschedule(tdbb);
Member

JRD_reschedule() is usually called for every record inside cursor->fetchNext() -> RecordSource::internalGetRecord(); is it really necessary to call it again after the store?

Member Author

Ok, will remove

Comment thread src/dsql/StmtNodes.cpp
}
auto bulk = transaction->getBulkInsert(tdbb, relation, true);
assignValues(tdbb, request, relation, record, impure->descs->begin());
cleanupRpb(tdbb, rpb);
Member

Same as above re. cleanupRpb()

Comment thread src/dsql/StmtNodes.cpp
ERR_post(Arg::Gds(isc_not_valid) << Arg::Str(name) << Arg::Str(NULL_STRING_MARK));
}

record->setNull(toField->fieldId);
Member

assignValues() seems to always be called after record->nullify(), so all fields are already initialized to NULLs and it shouldn't be necessary to do that again.

Member Author

Ok.

Comment thread src/dsql/StmtNodes.cpp
record->nullify();
record->setTransactionNumber(transaction->tra_number);

if (!(impure->flags & INIT_DONE))
Member

It looks like you could check for !impure->descs and thus avoid impure->flags altogether?

Member Author

Ok.

Comment thread src/jrd/ods.h
inline constexpr UCHAR ppg_dp_swept = 0x04; // Sweep has nothing to do on data page
inline constexpr UCHAR ppg_dp_secondary = 0x08; // Primary record versions not stored on data page
inline constexpr UCHAR ppg_dp_empty = 0x10; // Data page is empty
inline constexpr UCHAR ppg_dp_reserved = 0x20; // Slot is reserved for bulk insert
Member

If the slot was reserved, but the server crashed before the slot has been actually used, can it be reused for regular (non-bulk) operations? If not, shouldn't gfix clear this flag if such a case is detected?

Member Author

If the slot was reserved, but the server crashed before the slot has been actually used, can it be reused for regular (non-bulk) operations?

No, it can't

If not, shouldn't gfix clear this flag if such a case is detected?

Yes, and validation always clears it, see:

  • Validation::walk_data_page() never sets ppg_dp_reserved in pp_bits
  • thus, Validation::walk_pointer_page() always clears this bit
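A toy illustration of that clearing mechanism, with invented helper names; it only mimics the behavior described above (the recomputed bits never include ppg_dp_reserved, and the stored bits are replaced wholesale), not the actual Validation code:

```cpp
#include <cassert>

using UCHAR = unsigned char;

// Flag values from the src/jrd/ods.h fragment quoted above.
inline constexpr UCHAR ppg_dp_empty    = 0x10;
inline constexpr UCHAR ppg_dp_reserved = 0x20;

// Mirrors the role of Validation::walk_data_page(): derive the slot bits
// from the data page itself. ppg_dp_reserved is intentionally never set.
UCHAR recomputeSlotBits(bool pageIsEmpty)
{
    UCHAR bits = 0;
    if (pageIsEmpty)
        bits |= ppg_dp_empty;
    return bits;
}

// Mirrors the role of Validation::walk_pointer_page(): the stored bits are
// replaced by the freshly computed ones, so a stale reserved bit left by a
// crash disappears when the pointer page is rewritten.
UCHAR repairSlot(UCHAR storedBits, bool pageIsEmpty)
{
    (void) storedBits; // the old value is discarded, not merged
    return recomputeSlotBits(pageIsEmpty);
}
```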

Comment thread src/jrd/BulkInsert.cpp
CCH_RELEASE(tdbb, &dpWindow);

if (m_isPrimary)
tdbb->bumpStats(RecordStatType::INSERTS, m_relation->getId(), m_current->dpg_count);
Member

Why is this record-level counter increased per page rather than per record (inside putRecord(), I guess)?

Member Author

Until flush(), the records are not actually inserted into the relation yet.
Also, this is a kind of small optimization.

@hvlad
Member Author

hvlad commented May 3, 2026

The only thing (besides my other comments) that I don't really like is a code duplication between dpm.epp and BulkInsert.cpp. For example, storage of a big record seems to share about 80% of code.

Yes, this is not an easy question, and I considered it, of course. The current code is the best compromise I can think of.
I think the amount of duplicated code is relatively small.
Do you have a clear idea of how to avoid duplicating the code without creating additional branches in the existing code?

Do you think it could be improved (i.e. by adding separate routines to dpm.epp that deal with a preallocated page buffer)?

You mean - move (half of) BulkInsert into DPM? :)

@hvlad
Member Author

hvlad commented May 3, 2026

I'll implement the suggested changes and re-test, then commit them.
