Releases: moj-analytical-services/splink
v4.0.6
What's Changed
- Explicit selection by @ADBond in #2484
- Fix clustering in debug mode by @ADBond in #2485
- Less caching in debug mode by @ADBond in #2488
- Update changelog by @RobinL in #2497
- remove unnecessary import by @lubrst in #2500
- Spark test session handling by @ADBond in #2504
- Fix count_comparisons_from_blocking_rule by @RobinL in #2503
- Streamline docs by @RobinL in #2505
- Test and fix debug mode by @ADBond in #2481
- Improve compare two records by @RobinL in #2498
- Bug - get columns of DuckDB frame even when table is empty by @ADBond in #2510
- Update CONTRIBUTING.md with correct link by @zmbc in #2513
- Constrain dev pandas version by @ADBond in #2518
- Update lockfile + fixes for latest package versions by @ADBond in #2514
- Avoid bug with checkpointing by switching to parquet by @RobinL in #2525
- Clustering allows match weight args not just match probability by @RobinL in #2454
- Explicit tf columns select by @ADBond in #2527
- Make Settings._columns_used_by_comparisons unquoted by @ADBond in #2532
- Pairwise string distance comparison by @zmbc in #2517
- Bias blog 2 by @RossKen in #2408
- 4.0.6 release by @RobinL in #2537
New Contributors
Full Changelog: v4.0.5...v4.0.6
v4.0.5
What's Changed
- add EMA use case by @RobinL in #2468
- Change name of second __splink__cluster_count_row_numbered query, prevent table name conflict by @browo097302 in #2447
- Add iteration number to neighbours_filtered table by @ADBond in #2470
- Fix docs examples by @ADBond in #2471
- Docs - correct heading and link text by @ADBond in #2472
- Simplify Altair import by @ADBond in #2479
- Specify version range for pytest-cov in CI by @ADBond in #2489
- Compare two records - allow dataframes to be registered by @RobinL in #2493
- 4.0.5 release by @RobinL in #2495
Full Changelog: v4.0.4...v4.0.5
v4.0.4
What's Changed
- Handle threshold_match_probability 0 in predict() #2420 by @browo097302 in #2425
- Take converged clusters out of play by @RobinL in #2436
- Fix clustering in linky jobs with source dataset column on Postgres by @ADBond in #2444
- Cluster multiple thresholds v2 by @RobinL in #2437
- Use .blocking_rule_sql property in match_weights_interactive_history_chart() by @browo097302 in #2446
- restore pretty print of SplinkDataFrame by @RobinL in #2450
- 2440 add docstring to customrule by @RobinL in #2452
- Cluster multiple add stats by @RobinL in #2453
- Score missing intra-cluster edges by @ADBond in #2442
- Fix cluster studio docstring by @ADBond in #2455
- Docs cleanup by @Thomas-Hirsch in #2460
- Fix profile charts issue by @RobinL in #2466
- 4.0.4 release by @RobinL in #2467
New Contributors
- @browo097302 made their first contribution in #2425
Full Changelog: v4.0.3...v4.0.4
v4.0.3
v4.0.2
What's Changed
- Fix performance issue with exploding blocking rules by @RobinL in #2385
- Add cookbook to examples by @RobinL in #2388
- fix docs by @RobinL in #2389
- Create llm prompt by @RobinL in #2366
- 2351 fix spark sampling by @aymonwuolanne in #2390
- Improve number formatting and descriptions on match weight charts by @RobinL in #2392
- add labelling tool by @RobinL in #2393
- Fix ColumnsReversedLevel by @RobinL in #2395
- Add is_in_level and compute_comparison_vector_value testing functions to internals by @RobinL in #2396
- Migrate tests of comparisons and comparison levels to new testing framework by @RobinL in #2397
- Add AbsoluteDifferenceLevel by @RobinL in #2398
- TimeDifference docstring by @RobinL in #2400
- More levels docstrings by @RobinL in #2401
- add dates docs by @RobinL in #2402
- Better docstrings by @RobinL in #2404
- Add cosine similarity comparison level and comparison by @RobinL in #2405
- add gov transformation mag link by @RobinL in #2406
- Add cosine similarity tests and allow schemad data by @RobinL in #2407
- Consistency in usage of sql_dialect, sql_dialect_str, sqlglot_dialect by @RobinL in #2391
- ArraySubset comparison level by @RobinL in #2416
- Interactive comparison notebook by @RobinL in #2417
- 4.0.2 release by @RobinL in #2418
Full Changelog: v4.0.1...v4.0.2
v4.0.1
What's Changed
- Bias blog by @ericakane-moj in #2279
- Fix bug in Postgres example by @fhightower in #2352
- Added new use case to index.md by @AnthonyTacquet in #2363
- Fixing issue with readonly filesystems by @RossHammer in #2357
- Update changelog by @ADBond in #2370
- avoid attempting to cast Infinity to double for spark backend by @bkitej-rw in #2372
- Fix Spark 'InfinityD' bug by @ADBond in #2374
- Support duckdbpyrelation as input type by @RobinL in #2375
- Bump actions/download-artifact from 3 to 4.1.7 in /.github/workflows by @dependabot in #2377
- Splink datasets - simplify + restructure by @ADBond in #2378
- Fix docs reference for renamed class by @ADBond in #2380
- Update upload-artifact version in docs CI by @ADBond in #2381
- Allow specific m and u probabilities to be fixed during training by @RobinL in #2379
- Allow all charts to be generated as a dict by @RossHammer in #2361
- Splink 401 release by @RobinL in #2386
New Contributors
- @probjects made their first contribution in #2172
- @DavidFrenchSG made their first contribution in #2204
- @astimoore made their first contribution in #2229
- @dkaufman-rc made their first contribution in #2240
- @ericakane-moj made their first contribution in #2277
- @bnm3k made their first contribution in #2342
- @fhightower made their first contribution in #2352
- @AnthonyTacquet made their first contribution in #2363
- @RossHammer made their first contribution in #2357
- @bkitej-rw made their first contribution in #2372
Full Changelog: v4.0.0...v4.0.1
v4.0.0
See https://moj-analytical-services.github.io/splink/blog/2024/07/24/splink-400-released.html for the release announcement.
v4.0.0.dev9
What's Changed
- Comparison that has tf adjustments = True properly accounts for column expressions by @RobinL in #2267
- Adjust package top level imports by @ADBond in #2269
- Evaluation docstrings by @RobinL in #2271
- Remove broken EM training options by @ADBond in #2272
- Restore lat-long SQL test by @ADBond in #2273
- Consistent db_api argument name by @ADBond in #2278
- Turn off previously configured options by @ADBond in #2276
- Remove jan 1st option from date of birth comparison by @RobinL in #2281
- update release blog by @RobinL in #2284
- Small fixes by @ADBond in #2285
- Update Splink 4 docs by @ADBond in #2283
- update version by @RobinL in #2286
Full Changelog: v4.0.0.dev8...v4.0.0.dev9
Splink 4 dev 8
What's Changed
- Docs links by @RobinL in #2237
- Cherrypick various patches to master by @RobinL in #2241
- Update docstrings splink4 by @RobinL in #2246
- as spark dataframe in docs by @RobinL in #2247
- More docstrings by @RobinL in #2248
- Docstrings 3 by @RobinL in #2250
- Restore spark test mark by @ADBond in #2253
- add note about excludedocs by @RobinL in #2256
- Del accidentally committed testing script by @RobinL in #2258
- Splink 4 release blog v1 by @RobinL in #2235
- Find biggest block by @RobinL in #2260
- Blocking tutorial by @RobinL in #2262
- prevent integer overflow by @RobinL in #2263
- Remove clustering pairwise output format by @ADBond in #2264
- improve blocking below thres by @RobinL in #2265
- splink 4 dev8 release by @RobinL in #2266
Full Changelog: v4.0.0.dev7...v4.0.0.dev8
Dev 7
What's Changed
- Update docs for Splink4 by @RobinL in #2203
- Update comparison template library by @RobinL in #2214
- Further splink4 docs work by @RobinL in #2215
- Move comparison helpers by @RobinL in #2216
- Restore dev guides by @RobinL in #2217
- add back tags by @RobinL in #2218
- Splink4 docs: fix more links by @RobinL in #2225
- Athena linker splink4 migration by @RobinL in #2226
- Athena linker migration 2 by @RobinL in #2227
- Restore Athena example to docs by @RobinL in #2228
- Block to IDs by @RobinL in #2231
- dev7 release by @RobinL in #2236
Full Changelog: v4.0.0.dev6...v4.0.0.dev7