Update default postgresql.conf values in Dockerfile #2111

Open
@CatMe0w

Description

What happens?

Search performance in v0.14.0 appears to have significantly degraded compared to v0.13.2, with combined planning and execution time over 200 times slower (roughly 24 ms vs. 5 s for the query below).

To Reproduce

A public dataset is used to demonstrate the issue. The dataset was imported 10 times to create a table with ~12 million rows, amplifying the effect. Any sufficiently large dataset should show similar behavior.

Table schema:

CREATE TABLE legitimate_account
(
    id serial primary key,
    content TEXT
);

The content column was imported from legitimate_account.csv and repeated 10 times.

dataset=# SELECT COUNT(*) FROM legitimate_account;
  count   
----------
 12296170
(1 row)
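For reference, the import can be reproduced along these lines. This is a sketch: the report only names legitimate_account.csv, so the CSV layout (a single content column) and the staging-table approach are assumptions.

```sql
-- Load the CSV once into a staging table, then insert it 10 times
-- via a cross join with generate_series to reach ~12M rows.
CREATE TEMP TABLE staging (content TEXT);
\copy staging (content) FROM 'legitimate_account.csv' WITH (FORMAT csv)

INSERT INTO legitimate_account (content)
SELECT s.content
FROM staging s, generate_series(1, 10);
```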

Index creation:

CREATE INDEX search_idx ON legitimate_account
USING bm25 (id, content)
WITH (key_field = 'id');

Even though the dataset is in Chinese, the default tokenizer was used, eliminating potential tokenizer-related interference. I later tested with the chinese_lindera tokenizer, and the results were similar.

Query used:

EXPLAIN ANALYZE SELECT * FROM legitimate_account WHERE "content" @@@ '新闻' LIMIT 1000;

Performance comparison
v0.13.2:

Limit  (cost=10.00..2010.00 rows=1000 width=169) (actual time=1.057..6.852 rows=1000 loops=1)
  ->  Custom Scan (ParadeDB Scan) on legitimate_account  (cost=10.00..2010.00 rows=1000 width=169) (actual time=1.056..6.749 rows=1000 loops=1)
        Table: legitimate_account
        Index: search_idx
        Heap Fetches: 1000
        Exec Method: NormalScanExecState
        Scores: false
        Tantivy Query: {"with_index":{"oid":86688,"query":{"parse_with_field":{"field":"content","query_string":"新闻","lenient":null,"conjunction_mode":null}}}}
Planning Time: 10.886 ms
Execution Time: 13.211 ms

v0.14.0:

Limit  (cost=10.00..2010.00 rows=1000 width=169) (actual time=22.786..41.805 rows=1000 loops=1)
  ->  Custom Scan (ParadeDB Scan) on legitimate_account  (cost=10.00..2010.00 rows=1000 width=169) (actual time=22.784..41.735 rows=1000 loops=1)
        Table: legitimate_account
        Index: search_idx
        Heap Fetches: 1000
        Exec Method: TopNScanExecState
        Scores: false
        Top N Limit: 1000
        Tantivy Query: {"with_index":{"oid":78357,"query":{"parse_with_field":{"field":"content","query_string":"新闻","lenient":null,"conjunction_mode":null}}}}
Planning Time: 2702.707 ms
Execution Time: 2304.358 ms

Both versions were tested in newly created containers with no persistent storage. Each container started from scratch, and the dataset was re-imported in both cases.
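To rule out configuration drift between the two containers, the effective server settings can be compared directly in each one. This uses only the standard PostgreSQL pg_settings catalog view, nothing ParadeDB-specific:

```sql
-- List every setting that differs from its built-in default,
-- for a side-by-side diff between the v0.13.2 and v0.14.0 containers.
SELECT name, setting, unit, source
FROM pg_settings
WHERE source <> 'default'
ORDER BY name;
```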

Docker image used: 17-v0.14.0 and 17-v0.13.2
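Since this issue was ultimately triaged as a postgresql.conf change in the Dockerfile, one quick way to test whether conservative defaults explain the gap is to override settings at container start. The official postgres image entrypoint forwards `-c` arguments to the server, and the ParadeDB image is assumed to behave the same way; the specific values below are illustrative assumptions, not the defaults ParadeDB ships.

```shell
# Start the v0.14.0 image with selected postgresql.conf values
# overridden on the command line (values are illustrative only).
docker run -d --name paradedb \
  -e POSTGRES_PASSWORD=postgres \
  paradedb/paradedb:17-v0.14.0 \
  -c shared_buffers=2GB \
  -c maintenance_work_mem=1GB \
  -c max_parallel_workers_per_gather=4
```

If the slowdown disappears with tuned settings, that points at the image's default configuration rather than a regression in the scan executor itself.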

OS:

Linux

ParadeDB Version:

v0.14.0

Are you using ParadeDB Docker, Helm, or the extension(s) standalone?

ParadeDB Docker Image

Full Name:

(prefer not to say)

Affiliation:

N/A

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include the code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configurations (e.g., CPU architecture, PostgreSQL version, Linux distribution) to reproduce the issue?

  • Yes, I have

Metadata

Labels

  • docker: Pull requests that update Docker code
  • feature: New feature or request
  • priority-medium: Medium priority issue
  • user-request: This issue was directly requested by a user
