## How to get embeddings

Currently, we offer the embedding API v1 and v2 in beta.
To get embeddings, call the API endpoint with your text.
You can send either a single sentence or multiple sentences; the embeddings corresponding to the inputs are returned.

API Endpoint: https://api.sionic.ai/v1/embedding

### Command line Example

Request:
```shell
curl https://api.sionic.ai/v1/embedding \
  -H "Content-Type: application/json" \
  -d '{ "inputs": ["first query", "second query"] }'
```

Response:
```shell
{
  "embedding": [
    [...],
    [...]
  ]
}
```

### Python code Example

Get embeddings by directly calling Sionic's embedding API.

```python
from typing import List

import numpy as np
import requests


def get_embedding(queries: List[str], url: str) -> np.ndarray:
    # POST the texts to the embedding endpoint and return the result
    # as a (num_texts, dimension) float32 matrix.
    response = requests.post(url=url, json={'inputs': queries})
    return np.asarray(response.json()['embedding'], dtype=np.float32)


url = "https://api.sionic.ai/v1/embedding"
inputs1 = ["first query", "second query"]
inputs2 = ["third query", "fourth query"]
embedding1 = get_embedding(inputs1, url=url)
embedding2 = get_embedding(inputs2, url=url)

# Pairwise similarity between every text in inputs1 and every text in inputs2.
similarity = embedding1 @ embedding2.T
print(similarity)
```
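
Note that the dot product above equals cosine similarity only if the returned vectors are unit-normalized. We have not verified whether the API normalizes its output, so as a minimal sketch under that assumption you can normalize explicitly; the `cosine_similarity` helper below is illustrative, not part of the API:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Scale each row to unit length so the dot product of two rows
    # equals the cosine of the angle between them.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

print(cosine_similarity(embedding1, embedding2))
```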

You can also use the pre-defined [SionicEmbeddingModel]() to obtain embeddings.

```python
import SionicEmbeddingModel

inputs1 = ["first query", "second query"]
inputs2 = ["third query", "fourth query"]

# Request 2048-dimensional embeddings from the hosted model.
model = SionicEmbeddingModel(url="https://api.sionic.ai/v1/embedding",
                             dimension=2048)
embedding1 = model.encode(inputs1)
embedding2 = model.encode(inputs2)
similarity = embedding1 @ embedding2.T
print(similarity)
```

Inspired by [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding), we also apply an instruction to encode short queries for retrieval tasks.
With `encode_queries()`, the instruction is prepended to each query before encoding.
The instruction to use for both the v1 and v2 models is `"query: "`.

```python
import SionicEmbeddingModel

query = ["first query", "second query"]
passage = ["This is a passage related to the first query", "This is a passage related to the second query"]

model = SionicEmbeddingModel(url="https://api.sionic.ai/v1/embedding",
                             instruction="query: ",
                             dimension=2048)
# encode_queries() prepends the instruction to each query;
# passages are encoded as-is with encode().
query_embedding = model.encode_queries(query)
passage_embedding = model.encode(passage)
similarity = query_embedding @ passage_embedding.T
print(similarity)
```
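
For intuition, the prepending behavior described above can be sketched with the plain API helper from the first example; `encode_queries_sketch` is our illustrative stand-in, not Sionic's implementation:

```python
from typing import List

import numpy as np

def encode_queries_sketch(queries: List[str], url: str,
                          instruction: str = "query: ") -> np.ndarray:
    # Prepend the instruction to every query, then embed the prefixed
    # texts with get_embedding() defined in the first Python example.
    prefixed = [instruction + q for q in queries]
    return get_embedding(prefixed, url=url)
```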

## Massive Text Embedding Benchmark (MTEB) Evaluation

Both versions of Sionic AI's embeddings show state-of-the-art performance on the MTEB!