Batch Operations¶
Batch persistence¶
By default, Hypabase auto-saves to SQLite after every mutation. For bulk inserts, use batch() to defer persistence until the block exits:
with hb.batch():
    for i in range(1000):
        hb.node(f"entity_{i}", type="item")
        hb.edge([f"entity_{i}", "catalog"], type="belongs_to")
# Single save at the end, not 2000 saves
Note
batch() provides batched persistence, not transaction rollback. If an exception occurs mid-batch, partial in-memory changes remain and persist when the batch exits.
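If partial writes would be a problem, one option is to validate your input up front so nothing can raise once mutations start. A minimal sketch of that idea (the validation itself is hypothetical, not part of Hypabase):

item_ids = ["a1", "a2", "a3"]

# Fail fast before any mutation happens, so the batch never exits half-filled
if not all(isinstance(i, str) and i for i in item_ids):
    raise ValueError("bad input, aborting before any mutation")

with hb.batch():
    for item_id in item_ids:
        hb.node(item_id, type="item")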
Nested batches¶
Batches can nest. Only the outermost batch triggers a save:
with hb.batch():
    hb.node("a", type="x")
    with hb.batch():
        hb.node("b", type="x")
        hb.node("c", type="x")
    # No save yet — inner batch exited but outer batch is still open
    hb.node("d", type="x")
# Save happens here — outermost batch exits
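Nesting matters mainly for composition: a helper that opens its own batch() can be called on its own or inside a caller's larger batch without triggering extra saves. A small sketch (ingest_items is an illustrative helper, not a Hypabase API):

def ingest_items(items):
    # Safe to call standalone or inside an outer batch
    with hb.batch():
        for item_id in items:
            hb.node(item_id, type="item")

with hb.batch():
    ingest_items(["a1", "a2"])
    ingest_items(["b1", "b2"])
# One save here, at the outermost exit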
Upsert by vertex set¶
upsert_edge_by_vertex_set() finds an existing edge by its exact set of nodes, or creates a new one. This is useful for idempotent ingestion:
# First call creates the edge
edge = hb.upsert_edge_by_vertex_set(
    {"dr_smith", "patient_123", "aspirin"},
    edge_type="treatment",
    properties={"date": "2025-01-15"},
    source="clinical_records",
    confidence=0.95,
)

# Second call finds the existing edge (same vertex set)
edge = hb.upsert_edge_by_vertex_set(
    {"dr_smith", "patient_123", "aspirin"},
    edge_type="treatment",
    properties={"date": "2025-01-16"},  # updates properties
)
Custom merge function¶
Pass a merge_fn to control how the upsert merges properties:
def merge_latest(existing_props, new_props):
    return {**existing_props, **new_props}

hb.upsert_edge_by_vertex_set(
    {"a", "b"},
    edge_type="link",
    properties={"count": 2},
    merge_fn=merge_latest,
)
Cascade delete¶
Delete a node and all its incident edges in one call:
node_deleted, edges_deleted = hb.delete_node_cascade("patient_123")
# node_deleted: True if the node existed
# edges_deleted: number of edges removed
Compare with delete_node(), which removes only the node itself and leaves any incident edges in place.
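A rough side-by-side sketch of the difference (the return value of delete_node() isn't covered in this section, so it is ignored here):

hb.delete_node("patient_123")             # removes just the node; incident edges remain
hb.delete_node_cascade("patient_123")     # removes the node and every edge that touches it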
Bulk ingestion pattern¶
Combine batch() and context() for efficient bulk loading:
with hb.batch():
    with hb.context(source="data_import_v2", confidence=0.9):
        for record in records:
            hb.edge(
                record["entities"],
                type=record["relation_type"],
                properties=record.get("metadata", {}),
            )
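The loop assumes each record carries an entity list, a relation type, and optional metadata, along these lines (illustrative data, not a required schema):

records = [
    {
        "entities": ["dr_smith", "patient_123", "aspirin"],
        "relation_type": "treatment",
        "metadata": {"date": "2025-01-15"},
    },
    {
        "entities": ["patient_123", "clinic_7"],
        "relation_type": "visited",
    },
]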
This gives you:
- Single disk write at the end (batch)
- Consistent provenance across all edges (context)
- Auto-created nodes for any new entity IDs