Attempted to dereference unique_ptr that is NULL when inserting many rows #10745
Closed
Description
What happens?
Inserting more than 80000 rows (in my case) results in "duckdb.duckdb.FatalException: FATAL Error: Failed to create checkpoint because of error: Attempted to dereference unique_ptr that is NULL!". I first encountered this when I was updating 2000 rows like this:
duckdb.sql("CREATE TABLE temp AS SELECT * FROM df")
duckdb.sql("UPDATE target SET col = temp.col FROM temp WHERE target.id = temp.id")
duckdb.sql("DROP TABLE temp")
And I got the same error at DROP TABLE. I cannot share that code because it is proprietary, but I'll try to see if I can create something else to reproduce it. Below is a code snippet that triggers the crash when inserting.
To Reproduce
import duckdb
import random
import string
# Step 1: Generate 90000 random strings of max length 10
def generate_random_strings(n=90000, max_length=10):
    random_strings = []
    for _ in range(n):
        length = random.randint(1, max_length)  # Random length up to max_length
        random_str = ''.join(random.choices(string.ascii_letters + string.digits, k=length))
        random_strings.append(random_str)
    return random_strings
random_strings = generate_random_strings()
# Step 2: Connect to DuckDB. Passing a file name persists the data to disk;
# duckdb.connect() without arguments would create an in-memory database instead.
con = duckdb.connect(database='test.db', read_only=False)
# Step 3: Create a table with an integer ID column (the IDs are supplied explicitly on insert)
con.execute("""
    CREATE TABLE test_table (
        id INTEGER,
        random_string VARCHAR,
        emb FLOAT[]
    );
""")
con.execute("")
# Insert the strings into the table
insert_query = "INSERT INTO test_table (id, random_string) VALUES (?, ?)"
# DuckDB's executemany is used for batch insertion.
con.executemany(insert_query, [(i, s,) for i, s in enumerate(random_strings)])
# Verify the insertion
result = con.execute("SELECT COUNT(*) FROM test_table").fetchall()
print(f"Inserted rows: {result[0][0]}")
# Optional: Display some inserted rows to verify
sample_result = con.execute("SELECT * FROM test_table LIMIT 5;").fetchall()
print("Sample inserted rows:")
for row in sample_result:
    print(row)
# Close the connection (in case of a persistent database)
con.close()
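Because the script above draws fresh random strings on every run, two runs never insert the same data. A seeded variant of the generator, sketched below, makes runs repeatable. The `seed` parameter and the private `random.Random` instance are additions that are not part of the original script, and whether the crash depends on the string contents at all is not established.

```python
import random
import string

def generate_random_strings(n=90000, max_length=10, seed=0):
    # `seed` is a hypothetical extra parameter; a dedicated generator keeps
    # the output independent of the global random state.
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return [
        "".join(rng.choices(alphabet, k=rng.randint(1, max_length)))
        for _ in range(n)
    ]

random_strings = generate_random_strings()
print(len(random_strings))
```

It can be dropped in as a replacement for Step 1; the rest of the script stays unchanged.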
OS:
macOS M3 Pro 14.1
DuckDB Version:
v0.10.0 20b1486
DuckDB Client:
python 3.12.1
Full Name:
Mika Ristimäki
Affiliation:
Private
Have you tried this on the latest nightly build?
I have tested with a nightly build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- Yes, I have