AWS Bedrock Cohere embedding - got error "expected maxLength: 2048" #3942
Labels: 👻 feat:rag, stale
1. Is this request related to a challenge you're experiencing?
Could embedding models have a maxLength property, similar to context_size, so that text is split into chunks by maxLength?
Bedrock Cohere embedding: its "context_size" is 512.
The assumption is that 1 token is about 4 characters, so the limit is set to 512 tokens.
But that is not accurate: the model can in fact handle 1024 tokens. Its hard limit is 2048 characters.
We hit this scenario: a text of about 2500 characters that is only 300 tokens, so it passes the token check but is rejected with:
expected maxLength: 2048, actual: 2459
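A minimal sketch of the requested behavior, assuming a character-based limit: the text is split so that no chunk exceeds maxLength characters. The function name split_by_max_length is hypothetical and is not part of any existing API; it only illustrates the idea.

```python
def split_by_max_length(text: str, max_length: int = 2048) -> list[str]:
    """Split text into chunks of at most max_length characters.

    Hypothetical helper for illustration only. It prefers to break at the
    last space inside the window and falls back to a hard cut.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_length, len(text))
        if end < len(text):
            # Look for a space to break on, so words are not cut in half.
            cut = text.rfind(" ", start + 1, end)
            if cut != -1:
                end = cut + 1  # keep the space with the left-hand chunk
        chunks.append(text[start:end])
        start = end
    return chunks


# A 2459-character input, as in the error above, becomes two chunks,
# each within the 2048-character limit.
chunks = split_by_max_length("a" * 2459)
print([len(c) for c in chunks])
```

Splitting by characters rather than by estimated tokens avoids relying on the "1 token is about 4 characters" approximation, which is what fails in the scenario above.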
2. Describe the feature you'd like to see
In the model configuration, allow an alternative property, {maxLength|context_size}, or add a unit property, {tokens|characters}.
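One possible shape for such a configuration, as a sketch only: a limit value paired with an explicit unit. The names EmbeddingLimit and exceeds_limit are hypothetical, and the whitespace-based token counter is a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass
from typing import Callable, Literal


@dataclass(frozen=True)
class EmbeddingLimit:
    """Hypothetical model limit with an explicit unit."""
    value: int
    unit: Literal["tokens", "characters"]


def exceeds_limit(
    text: str,
    limit: EmbeddingLimit,
    # Placeholder token counter (assumption): one token per whitespace-separated word.
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> bool:
    """Return True if text is larger than the limit, measured in the limit's unit."""
    size = len(text) if limit.unit == "characters" else count_tokens(text)
    return size > limit.value


# The scenario from this issue: few tokens, but many characters.
text = "abcdefg " * 300  # 2400 characters, 300 whitespace-separated tokens
print(exceeds_limit(text, EmbeddingLimit(512, "tokens")))       # within the token limit
print(exceeds_limit(text, EmbeddingLimit(2048, "characters")))  # over the character limit
```

With a unit on the limit, the same input can be checked against whichever constraint the provider actually enforces.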
3. How will this feature improve your workflow or experience?
It would fix the Cohere error: expected maxLength: 2048
4. Additional context or comments
No response
5. Can you help us with this feature?