v0.10.1 Bugfix Release
This is a bug fix release for various issues discovered after we released 0.10.0. There are no new features, just bug fixes. Database files created by DuckDB v0.10.0 or v0.9.* can be read by DuckDB v0.10.1.
What's Changed
- Remove `visualizer` leftovers by @Y-- in #10642
- Add explicit numbering to C enums + various compilation/CI fixes by @Mytherin in #10649
- Disable print method for CSV scanner for R build by @hannes in #10650
- Fix #10548 for the DUCKDB_NO_THREADS case by @carlopi in #10654
- Allow StorageExtension to extend DuckCatalog implementation in order to integrate with observability system by @bleskes in #10643
- Update storage info for v0.10.0 by @szarnyasg in #10660
- Revamp duckdb-wasm extensions CI by @carlopi in #10672
- [CI] Re-enable skipped test `window-rows-overflow.test` by @Tishj in #10679
- Catch: prominently display skipped tests by @Mytherin in #10669
- Update Julia to 0.10.0 by @Mytherin in #10689
- Ingestion benchmark framework by @Tmonster in #10341
- [ICU] Add casts from Timestamp_* to TimestampTZ by @Tishj in #9539
- DISTINCT ON - greatly improve performance by rewriting ordered FIRST aggregate into arg_min_null by @Mytherin in #10684
- Fix #10685 - support aliases in join clause by @Mytherin in #10691
- Use assertThrows for throwing assertions in JDBC tests by @peteraisher in #10448
- Casts: report error location in query for failed casts by @Mytherin in #10694
- Fix duckdb spelling in _extension_deploy.yml by @carlopi in #10717
- Fuzzer #1374: ARG_XXX By Decimal by @hawkfish in #10728
- [Python] Rework the python regression test script by @Tishj in #10715
- Removes static member string by @TinyTinni in #10733
- Fuzzer #1372: Order Bind Failure by @hawkfish in #10727
- Fuzzer #1380: To Weeks Overflow by @hawkfish in #10726
- Various fixes by @carlopi in #10708
- Unittest does not satisfy assertion on MSVC/Debug by @TinyTinni in #10738
- Fix OrderPreservationType issue of MATERIALIZED CTEs by @kryonix in #10587
- Map creation fixes and refactoring by @taniabogatsch in #10436
- Fuzzer #1383: NULL Range Arguments by @hawkfish in #10723
- Fuzzer #1382: Window Stats Overflow by @hawkfish in #10725
- Comment on view columns by @samansmink in #10710
- Union exclude by @Tmonster in #10688
- move the logic for immediate_transaction_mode to the physical operator by @peterboncz in #10739
- [C API] Small fix and more tests by @taniabogatsch in #10748
- List_slice bug fix by @maiadegraaf in #10747
- Enable azure autoload by @samansmink in #10746
- Parquet writer - reduce memory usage of order-preserving write by @Mytherin in #10756
- Refactor csv reader includes because of r path length limitations by @hannes in #10658
- Arrow String View Type by @pdet in #10481
- local_file_system.cpp: minor fix for macOS libproc code by @barracuda156 in #10758
- Make unnamed_subquery naming predictable by @Mytherin in #10765
- [Python] Fix overflow issue in PandasAnalyzer by @Tishj in #10768
- Throw when trying to consume over 128 byte decimals by @pdet in #10601
- CLI: Right-align numerics in markdown tables by @Mytherin in #10767
- Fuzzer #1399: Window NULL RANGE by @hawkfish in #10776
- [CSV Sniffer] Minor sniffer tweak to give preference to dialects that generate the least errors if ignore_errors = true by @pdet in #10777
- Add large benchmark directory by @Tmonster in #10763
- Improve UNPIVOT error messages, and allow expressions in unpivot by @Mytherin in #10773
- Add a method UUID::FromUHugeint to generate a UUID from a uhugeint_t by @Mytherin in #10771
- [CSV Reader] Add lock to buffer reset by @pdet in #10791
- [WINDOWS] Add "/bigobj" that solves compile issue during debug by @maiadegraaf in #10782
- fix: not over-call AllSecrets by @stephaniewang526 in #10807
- Update readme by @szarnyasg in #10814
- [ODBC] Rework Connect to the ODBC driver and add functionality to set all DuckDB configurations in the Connection String by @maiadegraaf in #10692
- Fix arrow conversion, map doesn't support large offset by @yiyuanliu in #10796
- [CSV Reader] Spinlock over GetLine Error + New Strategy for dialect candidates by @pdet in #10755
- Trivial fixes by @carlopi in #10816
- Fix unicode handling in underscore of LIKE operator by @Mytherin in #10821
- JDBC spurious CI failure - an exception being thrown in this test is a race condition by @Mytherin in #10825
- Benchmark runner - allow files (e.g. CSV/Parquet) to be cached using the cache command by @Mytherin in #10817
- Fix #10803 - correctly reclaim space of list indexes when columns are dropped by @Mytherin in #10822
- [Upsert] `INSERT OR REPLACE` fixes by @Tishj in #10789
- [Dev] Add an optional time out in seconds to `run_tests_one_by_one.py` by @Tishj in #10744
- Maintain names in COLUMNS(*) expression, and allow aliasing multiple columns using {column} by @Mytherin in #10774
- Disable AWS/Azure on Windows for now by @Mytherin in #10827
- [Dev] Bump memory limit on batch_memory_usage.test_slow by @carlopi in #10845
- Fix coverity apt-get by @carlopi in #10838
- minor: FixedSizeBuffer::Pin move shared_ptr rather than copying by @mapleFU in #10837
- ci: Upgrade workflows to actions/setup-python@v5 by @krlmlr in #10832
- Fuzzer #1389: ARG_XXX Decimal Casts by @hawkfish in #10742
- Contributor guide: Fix new issue link by @szarnyasg in #10836
- Changing source to src in relational_constraints query by @Dtenwolde in #10848
- Fix: correctly calculate the range of build side for perfect hash join by @gitccl in #10446
- [Python] Fix issue caused by deadlock between `thread.join()` and acquiring the GIL by @Tishj in #10854
- [CSV Parser] 8-Byte Skipping instead of 1-Byte when possible by @pdet in #10855
- Add components of the version to duckdb.hpp by @ahuarte47 in #10840
- [CSV Sniffer] Tweaking header detection by @pdet in #10714
- Check if directory exists before removing files in regression test runner by @Tmonster in #10859
- [Extension] Add CatalogType to the list of functions generated in `extension_entries.hpp` by @Tishj in #10597
- Regression test build side probe side by @Tmonster in #10585
- [Arrow] Fix issue surrounding lifetime of dictionary arrays by @Tishj in #10610
- Fix #10745 - correctly deal with empty float columns in floating point compression routines by @Mytherin in #10863
- [Extensions] Build fixes by @carlopi in #10860
- Fix MSVC linking issue with workaround by @samansmink in #10865
- Reduce memory usage & avoid spilling to disk unnecessarily for order-preserving table creation/insertion by @Mytherin in #10862
- pb/avoid GetSchema opening a transaction by @peterboncz in #10740
- Support dollar-quoted string-constants in the CLI by @Mytherin in #10879
- Shell: avoid printing "Error: " prefix if the error message already has a prefix (e.g. Binder Error:, Parser Error:, etc) by @Mytherin in #10880
- Partially fix #10751: correctly catch exceptions in sqlite3_print_duckbox by @Mytherin in #10881
- Reset `refresh` in CompressedFile::Close() by @Maxxen in #10882
- [CSV Reader] Lock when getting progress by @pdet in #10884
- [CSV Sniffer] Early out if things go wrong in dialect detection by @pdet in #10872
- [CSV Reader] Fix for skipping mix of newline delimiters by @pdet in #10864
- Parallelize format.py script by @hatvik in #10646
- Remove Old PSQLODBC scripts by @maiadegraaf in #10888
- Add update_odbc_path.py to ODBC bundle by @maiadegraaf in #10895
- Improve Wasm.yml workflow by @carlopi in #10899
- [Parquet] Fix #10829, write correct data page offset in the presence of dictionaries by @hannes in #10890
- Table name binding does not fail for non-existent tables in DROP TABLE statements by @NiclasHaderer in #10893
- Fix #10889 - correctly deal with compressed vectors in struct filterpushdown of ColumnSegment::FilterSelection by @Mytherin in #10896
- CI: Disable julia nightly for now by @Mytherin in #10905
- CLI - add support for rendering errors/matching brackets for square ([]) and curly ({}) brackets as well by @Mytherin in #10904
- Storage: Fix an internal exception that could be triggered when deleting many rows and checkpointing repeatedly by @Mytherin in #10897
- LIMIT/OFFSET clean-up by @Mytherin in #10873
- Add ARRAY to test_all_types + IO and some clients by @Maxxen in #10850
- Use M1 (ARM) OSX runners by @hannes in #10670
- build: restore tarball build support by @Mause in #10900
- Fix #10902 - allow more expressions to be used with an indirection without brackets (. or [], etc) by @Mytherin in #10909
- feat(jdbc): fixed size array support by @Mause in #10911
- Add regexp_split_to_table macro by @szarnyasg in #10898
- [MetaTransaction] Add lock on modifying `all_transactions` and `transactions` by @Tishj in #10799
- Issue #10809: RANGE Hint Corrections by @hawkfish in #10828
- Enable the progress bar (without printing) in unittests by @Mytherin in #10908
- [Python][Dev] Fix issue in `read_csv` related to the s3 extension by @Tishj in #10690
- Add support for the C API duckdb_query function to the Julia api by @rdavis120 in #10886
- Fix #10501 - in LocalFileSystem::Write split writes into batches of at most 2GB by @Mytherin in #10912
- Correctly reset data chunk in RETURNING of DELETE by @Mytherin in #10915
- bitstring_agg had a trigger-able assertion, [duckdb-fuzzer/#1414] by @hannes in #10918
- Shell: Remove IEE754 function from CLI by @Mytherin in #10919
- Use correct index in string to nested cast error handling by @Mytherin in #10920
- Batch memory manager - keep track of all used memory correctly and enforce that unflushed memory is correctly set to 0 when we are finished by @Mytherin in #10922
- Fix #9975 - correctly open (and keep open) a transaction when checking if prepared statement needs to be rebound by @Mytherin in #10923
- CLI - Insert spaces when copy-pasting tabs by @Mytherin in #10924
- feat: exposing ssl ca cert path to httpfs by @pvaezi in #10704
- Centralize dynamic cast check and disable on MacOS by @Mytherin in #10925
- Checked Numeric Casts by @hannes in #10870
- Avoid running numeric cast checks when CRASH_ON_ASSERT is enabled by @Mytherin in #10942
- Set duckdb_api to 'python jupyter' if in Jupyter notebook by @guenp in #10931
- Array fuzzer issue fixes by @Maxxen in #10944
- Fix assertion trigger in FilterCombiner::AddTransitiveFilters by @Mytherin in #10941
- Support recursive describe queries (i.e. DESCRIBE(DESCRIBE ..)) by @Mytherin in #10945
- Avoid throwing null pointer exception in Window Segment Tree destructor by @Mytherin in #10937
- Fix an issue where partitions were not correctly considered in bound window expression equality by @Mytherin in #10939
- Fix for limit % with subquery on an empty table by @Mytherin in #10946
- Correctly visit all expressions during lateral join decorrelation, particularly with nested lateral joins by @Mytherin in #10936
- when you add the relation, make sure you call gettableIndexes() on th… by @Tmonster in #10949
- Internal #1428: Interval Subtract Overflow by @hawkfish in #10957
- Fuzzer #1445: Trap MAKE_DATE/TIME Overflows by @hawkfish in #10958
- Python.yml: Revert to macos-latest for OSX workflow by @carlopi in #10970
- Purge queue refactor by @taniabogatsch in #10594
- Change time from duckdb_time to duckdb_time_struct in duckdb_time_tz_struct by @Giorgi in #10933
- Add require block_sizes 262144 on tests reading db files by @carlopi in #10974
- [duckdb-fuzzer/#1368] - overflow in bitstring_agg on hugeint & uhugei… by @hannes in #10971
- Fix LIST->ARRAY TRY_CAST when list sizes mismatch by @Maxxen in #10973
- Add a micro extended benchmark. by @Tmonster in #10943
- [Dev] Move `TemporaryFileManager` and friends out of StandardBufferManager by @Tishj in #10938
- [ODBC] Reorganize Directory Structure by @maiadegraaf in #10979
- Internal #1385: Window Partition Collation by @hawkfish in #10985
- Fuzzer #1471: Trap MAKE_DATE Overflows by @hawkfish in #10987
- Range checks for ACOS by @hannes in #10972
- improve CheckBoundaryValues in TopN by @xuke-hat in #10955
- CSV tests - use TEST_DIR to prevent leaking file by @Mytherin in #10991
- Autoload INET and ICU (and add back sqlite and postgres as autoloadable) by @carlopi in #10948
- Fuzzer #1468: Window RANGE Types by @hawkfish in #10990
- Fuzzer #1446: Quantile Hugeint Interpolation by @hawkfish in #10983
- Override git hash / git version by @carlopi in #10977
- [Storage] Only call FinalizeOptimisticWriter after storage merge has succeeded by @Mytherin in #10998
- Add MetaTransaction::GetTransaction to threadsan suppressions (false positive) by @Mytherin in #11001
- Various fixes: CMake + generated extension_entries.hpp checks by @carlopi in #10994
- Nightly Wasm build fix by @taniabogatsch in #10993
- Fix return null constant in array_slice and other array issues by @Maxxen in #10992
- Add correct table bindings for window relations. by @Tmonster in #10997
- Clean up ExecutorTask and simplify waiting for all tasks to be cancelled by @Mytherin in #11005
- Various JSON thread sanitizer fixes by @Mytherin in #11004
- Fix warnings in ALP and logical_insert by @carlopi in #11008
- Check IsLoaded() before importing cached item by @Tmonster in #11007
- Issue #10995: ICU VARCHAR TIMETZ by @hawkfish in #11002
- Fix #10982 - only update total rows of row group collection after we finish appending to prevent other readers from attempting to initialize scans on in-progress appends by @Mytherin in #11011
- [Nightly] Block size nightly test changes by @taniabogatsch in #11010
- ATTACH with reserved names (temp/main) by @Mytherin in #11020
- Fix various tests for vector_size = 2 by @Mytherin in #11027
- [CI] Create a bigger table in interrupt test by @Mytherin in #11025
- Issue #10995: TIMETZ DST Fix by @hawkfish in #11024
- Fixup LinuxRelease.yml release: unittester was not invoked correctly by @carlopi in #11022
- In DatabaseInstance destructor - destroy TaskScheduler first by @Mytherin in #11021
- Refactor ATTACH options by @taniabogatsch in #11016
- Fix #11033: don't reset arena allocator in between calls to streaming window expression by @Mytherin in #11039
- Avoid checking LastModifiedTime for remote files in object cache by @Mytherin in #11034
- [Block Size Nightly] Enable more block size nightly tests by @taniabogatsch in #11036
- Add missing pipeline dependencies in recursive CTE by @kryonix in #11043
- Use batch limit only when limit + offset are small constants by @Mytherin in #11035
- Add New CSV Error for Invalid Unicode by @pdet in #10984
- [FIX] Lambda bug in subqueries by @taniabogatsch in #11046
- [Swift] performance optimisations by @tcldr in #11052
- add concat_ws to spark API by @nicornk in #11051
- feat(jdbc): expose comments via jdbc methods by @Mause in #11031
- [CI / Tests] Disable CSV sniffer test for smaller vector sizes and reduce block-size nightly runtime by @taniabogatsch in #11055
- [CSV Sniffer] Consider date/timestamp formats from the user when sniffing by @pdet in #11057
- Extend the "Contents of view were altered" error with more information by @Tishj in #11064
- [Python] Add some numeric and string functions to spark API by @mariotaddeucci in #11067
- [ODBC] Allow multiple statements to be executed using SQLExecDirect by @maiadegraaf in #11038
- [CSV Reader] Apply projection on over buffer values. by @pdet in #11056
- Python: use short paths for Windows by @Mytherin in #11068
- Fix #10752: Add support for Parquet encryption on Windows by @Mytherin in #11069
- [Python] Code Quality - PEP8 Compliant + only relevant imports by @mariotaddeucci in #11070
- [CI] Add patch argument to patch the extension's sources before building by @krlmlr in #10831
- fix(jdbc): support fractional seconds in getTime by @Mause in #10707
- Fix #11071 - correctly report progress when scanning multiple Parquet files by @Mytherin in #11072
- Add ipv6 inet + minor fixes by @carlopi in #11073
- Implement IPv6 support in the inet extension. by @troycurtisjr in #10839
- [CI] Add step to verify C API enum integrity. by @Tishj in #10664
- Issue #10995: TIMETZ DST Fix by @hawkfish in #11079
- Check Nested Types for UTF-8 Correctness by @pdet in #11086
- Add `scope` column to `duckdb_settings` by @Tishj in #11017
- Partitioned write - flush batches periodically (every 500K rows) instead of only writing when all data has been gathered by @Mytherin in #10976
- [Python] Add `extract_statements` and the Statement class by @Tishj in #10891
- [Python] Improve performance of conversion to Numpy/Pandas for nested lists by @Tishj in #10826
- Fix an InternalException caused by DICTIONARY_VECTOR inside `map_from_entries` by @Tishj in #11091
- Fix #11084 - fixes an issue with the Parquet writer when writing vectors of lists with repeated list elements (as can be generated through a join) by @Mytherin in #11094
- Add callbacks for newly added connections, and allow extensions to rebind queries as a result of planning failures by @Mytherin in #11096
- Fuzzer #2376: INTERVAL Multiply Overflow by @hawkfish in #11100
- Make test/sql/copy/csv/test_limit_spinlock.test a slowtest by @pdet in #11088
- Fix #11063 - avoid throwing exception in InClauseRewriter by @Mytherin in #11090
- Add a hint on how to resolve lockups when using ninja. by @troycurtisjr in #11074
- Fix LocalFileSystem::Read/Write, update location after read/write some data by @yiyuanliu in #11105
- run_tests_one_by_one - add a default timeout of 1 hour by @Mytherin in #11104
- [Fix] Aliases in subqueries by @taniabogatsch in #11103
- [CSV Scan] Implement ignore_errors for Dates/Timestamps/Decimals by @pdet in #11083
- Fix TaskScheduler deadlock on NumberOfThreads by @Tishj in #11093
- Fix return null constant in list_resize and list_aggr by @maiadegraaf in #11111
- Add rowsort to more tests for queries that don't have a defined order by @Mytherin in #11110
- Do not extract filters that cannot be hyper edges (Join Order Optimizer) by @Tmonster in #11108
- Add Dictionary vector verification by @Mytherin in #11114
- [Dev][Python] Make `test_httpfs.py` error test more lenient by @Tishj in #11125
- Fix `ConstantVector::Reference` for dictionary arrays by @Maxxen in #11136
- Disable jemalloc for ARM distributions and clean up when closing DB by @lnkuiper in #11130
- Case sensitivity issue secret manager by @samansmink in #11128
- Bump az aw vcpkg by @samansmink in #11127
- Fix broken micro benchmarks so they can be run weekly by @Tmonster in #11113
- Refactor OSX.yml, now inputs can be provided by @carlopi in #11133
- Merge main into feature by @Tishj in #11141
- Revert "Merge main into feature" by @Mytherin in #11145
- Fix upload assets script by @carlopi in #11144
- Bump spatial by @Maxxen in #11132
- Fix upload assets OSX/2 by @carlopi in #11148
- Add more vector type verification tests/settings by @Mytherin in #11138
- Allow for customization of catalog lookup behavior for different catalog types by @Mytherin in #11151
- [Python][Arrow] Don't deduplicate column names when outputting to Arrow by @Tishj in #11160
- More Array and Union fixes by @Maxxen in #11161
- Refactor upload logic (towards staged releases) by @carlopi in #11156
- Fuzzer #2490: Generate NULL TIMESTAMPTZ by @hawkfish in #11143
- Add folder parameter to upload logic and upload also twine artifacts by @carlopi in #11169
- CI: Find mirror issues among all issues, not just open issues by @szarnyasg in #11170
- Fix `TupleDataCollection` serialization of dictionary vectors containing nested data by @lnkuiper in #11174
- Allow overriding of git describe also in scripts (via OVERRIDE_GIT_DESCRIBE environment variable) by @carlopi in #11179
- Python staged releases: centralized staged upload by @carlopi in #11187
- Fix RevertAppendInternal by @Mytherin in #11177
- TwineUpload: Add awscli dependency + minor rework by @carlopi in #11193
- Fix issue in copy constructor of ExtraDropSecretInfo by @samansmink in #11190
- Unify CSV/JSON and Parquet Batch Writing Code - and fix memory management issues in CSV/JSON writing by @Mytherin in #11188
- More conservative dummy list entry estimation by @Maxxen in #11185
- retry on 500 error by @samansmink in #11184
- Fix warning on unused utf_type by @carlopi in #11198
- Remove outdated duckdb-node related scripts by @carlopi in #11180
- Add StagedUpload.yml by @carlopi in #11189
- Improving CSV Casting error message by @pdet in #11183
- Improve conversion error message in Parquet reader by @Mytherin in #11199
- Fix thread sanitizer issues by @Mytherin in #11200
- Internal #1564: Range Join DISTINCT by @hawkfish in #11205
- small fix secret autoloading, bump azure by @samansmink in #11182
- CSV reader - suggest enabling null_padding and ignore_errors in case of missing columns by @Mytherin in #11201
- CI: Create/label mirror issue job should list all internal issues by @szarnyasg in #11204
- [Python] Add `IS NULL` / `IS NOT NULL` support to Expression API by @cmdlineluser in #11175
- Remove old assertions in SegmentTree by @Mytherin in #11208
- [lambda] Fix for list_reduce giving the wrong result by @maiadegraaf in #11171
- [Dev] Fix various issues discovered by #11137 by @Tishj in #11210
- [Dev] Fix py override describe by @carlopi in #11209
- Sanitize CSV Newline identifier for writing CSV files by @pdet in #11106
- Fix persistent secret file permissions by @samansmink in #11172
- Retry Binding Prior To Execution by @Mytherin in #11149
- Avoid copying LogicalType in FlatVector::SetNull. by @yiyuanliu in #11214
- Review of CI on tags + add R extensions CI to InvokeCI.yml by @carlopi in #11212
- Fix python and apply patches + bump extensions by @carlopi in #11217
- Disable jemalloc on arm in Python package as well by @Mytherin in #11218
Full Changelog: v0.10.0...v0.10.1