The default `json` filter escapes non-ASCII characters, converting them into `\uXXXX` Unicode escape sequences when it serializes JSON objects. In our use case, we use the Fluid library to construct prompts for a large language model, and the escaped sequences are difficult for the model to process. We would therefore prefer that the `json` filter not escape non-ASCII characters during serialization.
For example, `你好，这是一条短信` is converted to `\u4F60\u597D\uFF0C\u8FD9\u662F\u4E00\u6761\u77ED\u4FE1`. The escaped form is harder for the model to understand, because models are pretrained mostly on plain text rather than on escaped Unicode sequences.
It would be better if the `json` filter in the Fluid library let us configure how non-ASCII characters are serialized.
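To illustrate the behavior being requested, here is a minimal sketch using Python's standard `json` module, which exposes this choice through its `ensure_ascii` flag. This is only an analogy: it is not Fluid's API, and the variable names are made up for the example. Python also emits lowercase hex digits in its escapes, unlike the uppercase output shown above.

```python
import json

# Hypothetical payload that would be embedded in a prompt.
message = {"text": "你好，这是一条短信"}

# Default behavior: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(message)

# Requested behavior: non-ASCII characters are written out as-is.
readable = json.dumps(message, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same object, so only the textual form differs.
assert json.loads(escaped) == json.loads(readable)
```

Both outputs are valid JSON for the same data; the second simply keeps the original characters in the text the model sees.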