`paradedb.exists` missing rows when querying text fields.

### What happens?

I run this query to get entries that have been transcribed:
```sql
SELECT
    c0."id", c0."filename", c0."call_length", c0."transcript"
FROM "calls" AS c0
WHERE
    (c0."id" @@@ paradedb.exists('transcript'::paradedb.fieldname))
    AND (c0."system_id" @@@ 'moco_md_ps')
ORDER BY c0."id" DESC;
```
*For context, there's a brief window of time where a call has been uploaded but has yet to be transcribed. It's desirable not to display these.*


ID `25062` is missing in the results, even though its transcript is present:
```text
-[ RECORD 50 ]----------------------------------------------------------------------------------------------------------------------------------
id          | 25063
filename    | 6000-1733067200_852912500.1-call_36.wav
call_length | 2
transcript  | █████████████████████
-[ RECORD 51 ]----------------------------------------------------------------------------------------------------------------------------------
id          | 25061
filename    | 6020-1733067184_852912500.1-call_34.wav
call_length | 8
transcript  | ██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
```
*I've redacted the transcripts for privacy's sake.*

When I remove the `paradedb.exists` clause:
```sql
SELECT
    c0."id", c0."filename", c0."call_length", c0."transcript"
FROM "calls" AS c0
WHERE
    (c0."system_id" @@@ 'moco_md_ps')
ORDER BY c0."id" DESC;
```

ID `25062` appears as expected:
```text
-[ RECORD 52 ]-----------------------------------------------------------------
id          | 25063
filename    | 6000-1733067200_852912500.1-call_36.wav
call_length | 2
transcript  | █████████████████████
-[ RECORD 53 ]-----------------------------------------------------------------
id          | 25062
filename    | 6000-1733067166_851337500.1-call_29.wav
call_length | 27
transcript  | ██████████████████████████████████████████████████████████████...
-[ RECORD 54 ]-----------------------------------------------------------------
id          | 25061
filename    | 6020-1733067184_852912500.1-call_34.wav
call_length | 8
transcript  | ██████████████████████████████████████████████████████████████...
```

The problem occurs whether or not the `(c0."system_id" @@@ 'moco_md_ps')` clause is present.

### To Reproduce

I haven't figured out what exactly causes this. It seems to affect longer passages of text, but that may just be observation bias, since a missing long transcript stands out more.
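
A diagnostic query along these lines might help narrow it down. It is only a sketch: it assumes untranscribed calls store `NULL` in `transcript` and that `id` is never `NULL` (both assumptions, not confirmed above). It lists rows that have a transcript in the table but are not returned by `paradedb.exists`, longest first, which would show whether length really correlates with the missing rows.

```sql
-- Diagnostic sketch: rows with a non-NULL transcript in the table
-- that paradedb.exists does not return.
-- Assumption: untranscribed calls have transcript = NULL, and id is NOT NULL.
SELECT
    c0."id", length(c0."transcript") AS transcript_chars
FROM "calls" AS c0
WHERE
    c0."transcript" IS NOT NULL
    AND c0."id" NOT IN (
        -- Same exists clause as in the failing query above
        SELECT c1."id"
        FROM "calls" AS c1
        WHERE (c1."id" @@@ paradedb.exists('transcript'::paradedb.fieldname))
    )
ORDER BY transcript_chars DESC;
```

If this returns rows, their IDs and lengths could be compared against rows that are found correctly, without sharing the transcript text itself.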

### OS:

Ubuntu 24.04.1 LTS, x64, Ryzen 7900X

### ParadeDB Version:

v0.13.0

### Are you using ParadeDB Docker, Helm, or the extension(s) standalone?

ParadeDB Docker Image

### Full Name:

Cameron Duley

### Affiliation:

My own behalf

### Did you include all relevant data sets for reproducing the issue?

No - I cannot share the data sets because they are confidential

### Did you include the code required to reproduce the issue?

- [X] Yes, I have

### Did you include all relevant configurations (e.g., CPU architecture, PostgreSQL version, Linux distribution) to reproduce the issue?

- [X] Yes, I have

`paradedb.exists` missing rows when querying text fields. #1994
