# bert-turkish-organization-ner
This model is a fine-tuned version of dbmdz/bert-base-turkish-128k-uncased on the yeniguno/turkish-organization-ner-dataset dataset.
Unlike general NER models, it is trained only for organization detection (ORG).
The labels are: `O` (outside), `B-ORG` (beginning of an organization), `I-ORG` (inside an organization).
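As an illustration of this BIO scheme, a sentence mentioning an organization would be tagged as follows (word-level for readability; the actual model assigns labels to WordPiece sub-tokens):

```python
# Illustrative word-level BIO tagging under the ORG-only label scheme.
# "Koç Holding yeni bir fabrika açtı." = "Koç Holding opened a new factory."
tokens = ["Koç", "Holding", "yeni", "bir", "fabrika", "açtı", "."]
labels = ["B-ORG", "I-ORG", "O", "O", "O", "O", "O"]

for token, label in zip(tokens, labels):
    print(f"{token:<10} {label}")
```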
## Model description

A lightweight NER model that detects only organization entities, rather than the full entity set covered by general-purpose NER models.
## How to use it
You can load the model directly with the 🤗 pipeline API for NER:
```python
from transformers import pipeline

model_id = "yeniguno/bert-turkish-organization-ner"
ner = pipeline("ner", model=model_id, aggregation_strategy="simple")

text = "Microsoft ve Koç Holding birlikte bir proje başlattı."
print(ner(text))
"""
[{'entity_group': 'ORG', 'score': np.float32(0.99849355), 'word': 'microsoft', 'start': 0, 'end': 9},
 {'entity_group': 'ORG', 'score': np.float32(0.9970416), 'word': 'koc holding', 'start': 13, 'end': 24}]
"""
```
## Intended uses & limitations
- Guardrails in LLM applications: detect and flag organization names in user prompts or model outputs.
- Content filtering & compliance: e.g. anonymization, redaction, or entity-specific monitoring.
- Analytics: extracting organization mentions from Turkish text for search, clustering, or knowledge graphs.
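As a sketch of the redaction use case, the `start`/`end` character offsets in the pipeline output can be used to mask organization mentions. The `redact_orgs` helper below is a hypothetical example operating on entities in the output format shown above:

```python
def redact_orgs(text, entities, placeholder="[ORG]"):
    """Replace detected organization spans with a placeholder.

    `entities` is a list of dicts in the NER pipeline's output format,
    each with 'entity_group', 'start', and 'end' keys.
    """
    # Process spans right-to-left so earlier offsets stay valid after replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["entity_group"] == "ORG":
            text = text[:ent["start"]] + placeholder + text[ent["end"]:]
    return text

# Entities as returned by the pipeline for the example sentence above.
text = "Microsoft ve Koç Holding birlikte bir proje başlattı."
entities = [
    {"entity_group": "ORG", "score": 0.998, "word": "microsoft", "start": 0, "end": 9},
    {"entity_group": "ORG", "score": 0.997, "word": "koc holding", "start": 13, "end": 24},
]
print(redact_orgs(text, entities))
# → [ORG] ve [ORG] birlikte bir proje başlattı.
```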
## Training and evaluation data
It achieves the following results on the evaluation set:
- Loss: 0.1152
- F1: 0.9159
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
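These settings imply a linear warmup over the first 10% of training. With 8,080 optimizer steps per epoch (per the results table) and 5 epochs, the schedule works out as follows (a quick arithmetic check):

```python
# Warmup-step arithmetic implied by the hyperparameters above.
steps_per_epoch = 8080        # from the training-results table
num_epochs = 5
warmup_ratio = 0.1

total_steps = steps_per_epoch * num_epochs       # 40400
warmup_steps = int(total_steps * warmup_ratio)   # 4040

print(total_steps, warmup_steps)
# → 40400 4040
```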
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 |
|---|---|---|---|---|
| 0.0617 | 1.0 | 8080 | 0.0679 | 0.8990 |
| 0.0471 | 2.0 | 16160 | 0.0640 | 0.9105 |
| 0.0295 | 3.0 | 24240 | 0.0846 | 0.9110 |
| 0.0277 | 4.0 | 32320 | 0.0959 | 0.9153 |
| 0.0116 | 5.0 | 40400 | 0.1152 | 0.9159 |
### Framework versions
- Transformers 4.55.2
- Pytorch 2.9.0.dev20250816+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
## Base model

dbmdz/bert-base-turkish-128k-uncased