Update README.md

README.md (CHANGED)

---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- telepix
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# PIXIE-Rune

PIXIE-Rune is an encoder-based embedding model trained on Korean and English triplets, developed by [TelePIX Co., Ltd](https://telepix.net/). It is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** Multilingual, optimized for high performance in Korean and English
- **License:** apache-2.0
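
To sanity-check these two properties from Python, the snippet below is a minimal sketch (the model path is an assumption; substitute the actual Hub id or local directory of this checkpoint):

```python
from sentence_transformers import SentenceTransformer

# Assumed path for illustration; replace with the real checkpoint location.
model = SentenceTransformer("PIXIE-Rune-M-v1.0")

print(model.get_sentence_embedding_dimension())  # expected: 1024
print(model.similarity_fn_name)                  # expected: "cosine"
```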

### Full Model Architecture

## Usage

First, install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Load the model
model_name = 'PIXIE-Rune-M-v1.0'
model = SentenceTransformer(model_name)

# Define the queries and documents (English translations of the Korean originals)
queries = [
    "In which industries does TelePIX use satellite data?",
    "What satellite services are offered for the defense sector?",
    "How advanced is TelePIX's technology?",
]
documents = [
    "TelePIX analyzes satellite data and provides services in a wide range of fields, including defense, agriculture, resources, and maritime.",
    "It provides precision analysis services for defense through satellite imagery collected for reconnaissance and surveillance purposes.",
    "TelePIX's optical payloads and AI analysis technology are evaluated as exceeding the global standard.",
    "TelePIX creates new value, the 'Space Economy', by analyzing information collected in space.",
    "TelePIX provides solutions covering the full cycle from satellite image acquisition to analysis and service delivery.",
]

# Compute embeddings: use `prompt_name="query"` to encode queries!
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

# Compute cosine similarity scores
scores = model.similarity(query_embeddings, document_embeddings)

# Output the results, ranking documents per query
for query, query_scores in zip(queries, scores):
    doc_score_pairs = list(zip(documents, query_scores))
    doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
    print("Query:", query)
    for document, score in doc_score_pairs:
        print(score, document)
```
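
Two details of this snippet are worth calling out: `prompt_name="query"` applies the query prompt stored with the model, so queries and documents are encoded asymmetrically, and `model.similarity` scores every query against every document, so with 3 queries and 5 documents `scores` has shape `[3, 5]`.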

### Framework Versions
- Python: 3.10.16
- Datasets: 2.21.0
- Tokenizers: 0.21.1
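
You can also fine-tune this model on your own dataset. The sketch below is a minimal, hypothetical example (not the authors' training recipe) that continues training on (anchor, positive, negative) triplets with the sentence-transformers v3+ trainer; the model path and toy data are assumptions:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import TripletLoss

# Assumed path for illustration; replace with the real checkpoint location.
model = SentenceTransformer("PIXIE-Rune-M-v1.0")

# Toy triplets; the column order (anchor, positive, negative) matches TripletLoss inputs.
train_dataset = Dataset.from_dict({
    "anchor": ["What satellite services are offered for the defense sector?"],
    "positive": ["Precision defense analysis is provided through reconnaissance and surveillance imagery."],
    "negative": ["Satellite data also supports agriculture and maritime monitoring."],
})

loss = TripletLoss(model)
trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()

model.save_pretrained("PIXIE-Rune-finetuned")
```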

## Contact

If you have any suggestions or questions about this model, please reach out to the authors at bmkim@telepix.net.