Open
Description
- loss scoring
- multi-perceptor
- weighted multi-perceptor
- cutout methods and augmentations? Maybe make those an independent library?
- perceptor weight interpolations/schedules - https://discord.com/channels/729741769192767510/730484623028519072/956979309686423602
- API should be agnostic with respect to media type, i.e. the contrasting modalities could both be text, or one could be audio and the other video, etc.
- optionally augment with positional information/embeddings?
- Maybe some minimal translation API to facilitate use by non-English users, and conversely support for non-English models
- see aphantasia's SBERT utilization: https://github.com/eps696/aphantasia
- Check for installed/available CLIP, use vendored if not available
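
A rough sketch of how weighted multi-perceptor loss scoring with weight schedules might fit together while staying agnostic to media type. Everything here is an assumption for discussion, not an existing API: a "perceptor" is taken to be any callable that maps an input (text, image, audio, ...) to an embedding vector, and the names `cosine_distance`, `linear_schedule`, and `weighted_multi_perceptor_loss` are hypothetical. Plain Python lists stand in for tensors.

```python
import math
from typing import Any, Callable, Dict, Sequence

Embedding = Sequence[float]
Perceptor = Callable[[Any], Embedding]   # hypothetical: any media -> embedding
Schedule = Callable[[int], float]        # hypothetical: step -> weight


def cosine_distance(a: Embedding, b: Embedding) -> float:
    """Return 1 - cosine similarity of two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero-length embedding")
    return 1.0 - dot / (norm_a * norm_b)


def linear_schedule(start: float, end: float, total_steps: int) -> Schedule:
    """Interpolate a perceptor weight linearly from `start` to `end`."""
    def weight_at(step: int) -> float:
        if total_steps <= 1:
            return end
        # Clamp progress to [0, 1] so out-of-range steps stay at the endpoints.
        t = min(max(step / (total_steps - 1), 0.0), 1.0)
        return start + t * (end - start)
    return weight_at


def weighted_multi_perceptor_loss(
    candidate: Any,
    target: Any,
    perceptors: Dict[str, Perceptor],
    schedules: Dict[str, Schedule],
    step: int,
) -> float:
    """Weighted mean of per-perceptor cosine distances at a given step."""
    total = 0.0
    weight_sum = 0.0
    for name, embed in perceptors.items():
        weight = schedules[name](step)
        total += weight * cosine_distance(embed(candidate), embed(target))
        weight_sum += weight
    # If every weight is zero there is nothing to score.
    return total / weight_sum if weight_sum > 0 else 0.0
```

Because the loss only ever calls `embed(candidate)` and `embed(target)`, the two sides could be text/text, audio/video, or anything else the perceptors accept. Whether weights should be normalized (as here) or left raw is an open design question.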
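
For the installed-versus-vendored CLIP check, one possible approach is a small import helper built on the standard library's `importlib`. The helper name `resolve_module` and the vendored path in the usage note are hypothetical; this is only a sketch of the lookup order, not a decision on where a vendored copy would live.

```python
import importlib
import importlib.util
from types import ModuleType


def resolve_module(preferred: str, fallback: str) -> ModuleType:
    """Import `preferred` if it is installed, otherwise import `fallback`."""
    for name in (preferred, fallback):
        try:
            # find_spec returns None when a top-level module is missing and
            # raises ModuleNotFoundError when a dotted name's parent is missing.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if found:
            return importlib.import_module(name)
    raise ImportError(f"neither {preferred!r} nor {fallback!r} is importable")
```

Usage might look like `clip = resolve_module("clip", "ourpackage._vendor.clip")`, where `ourpackage._vendor.clip` is a placeholder for wherever the vendored copy ends up.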