'Merge_Models' with new topic_model from outliers #2222
Comments
I have a hard time understanding exactly what you mean here. Could you give an example? Perhaps showcase what is happening and what you would expect to happen?
The representative documents are indeed displayed as NaN since
Hi @MaartenGr and @OnAnd0n, let me share my code as an example here, as I am also trying to extract the outliers from a topic model, fit a new topic model on the outliers only, and then merge both models. What I tried so far:
Is this a possible approach? And why do I not get, e.g., the topic frequencies for the merged model? This:
Could you share a bit more information? Which version of BERTopic are you using? Can you share the output? You mention that it does not work, but what exactly does that mean? Do you get an error or perhaps unexpected results?
I would like to use 'Merge_Models' in BERTopic to re-cluster the outliers with HDBSCAN and merge them with the existing topics.
However, there are currently some challenges with the Merge_Models functionality:
When merging the Topic_model (fitted on all data, including outliers) and the Out_Topic_model (fitted only on the outliers), the 'Count' of topic -1 in the Topic_model increases by the number of outliers, instead of the two models being effectively concatenated.
The Representative_docs are displayed as NaN.
=> is the only way?
My BERTopic Version is 0.16.3
How can these issues be resolved?
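For reference, the workflow described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from this issue: the helper names `split_outliers` and `recluster_and_merge` are hypothetical, both models use default BERTopic settings, and `min_similarity=0.7` is simply the library default rather than a recommended value.

```python
from typing import List, Sequence, Tuple


def split_outliers(docs: Sequence[str], topics: Sequence[int]) -> Tuple[List[str], List[str]]:
    """Split documents into (inliers, outliers); BERTopic labels outliers as topic -1.

    Hypothetical helper, not part of BERTopic.
    """
    inliers = [doc for doc, topic in zip(docs, topics) if topic != -1]
    outliers = [doc for doc, topic in zip(docs, topics) if topic == -1]
    return inliers, outliers


def recluster_and_merge(docs: Sequence[str], min_similarity: float = 0.7):
    """Fit a model on all docs, fit a second model on its outliers, then merge both.

    Hypothetical helper; assumes BERTopic is installed and default settings are acceptable.
    """
    from bertopic import BERTopic  # imported lazily so the helper above has no dependencies

    # Model fitted on all data, outliers included
    topic_model = BERTopic().fit(list(docs))

    # Documents that the first model assigned to topic -1
    _, outlier_docs = split_outliers(docs, topic_model.topics_)

    # Second model fitted on the outliers only
    out_topic_model = BERTopic().fit(outlier_docs)

    # Topics of the second model are compared against those of the first model
    return BERTopic.merge_models(
        [topic_model, out_topic_model], min_similarity=min_similarity
    )


# Usage (needs a reasonably large corpus to produce meaningful clusters):
# merged_model = recluster_and_merge(docs)
# print(merged_model.get_topic_info())
```

Calling `get_topic_info()` on the merged model is where the 'Count' and Representative_docs columns mentioned above can be inspected.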