| --- |
| license: apache-2.0 |
| language: |
| - en |
| pipeline_tag: text-classification |
| --- |
| # Model Card for cssupport/mobilebert-sql-injection-detect |
|
|
|
|
| Based on [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) (MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks). This model detects SQL injection attacks in an input string (see "How to Get Started with the Model" below). It is a very light model (about 100 MB) and can be used for edge computing use cases. It was fine-tuned on the [SQl_Injection](https://www.kaggle.com/datasets/sajid576/sql-injection-dataset) dataset from [Kaggle](https://www.kaggle.com). |
| **Please test the model before deploying into any environment**. |
| Contact us for more info: support@cloudsummary.com |
| ### Code Repo |
| The code is available at https://github.com/cssupport23/AI-Model---SQL-Injection-Attack-Detector |
| |
| ## Model Details |
| |
| ### Model Description |
| |
| <!-- Provide a longer summary of what this model is. --> |
| |
| |
| |
| - **Developed by:** cssupport (support@cloudsummary.com) |
| - **Model type:** Language model |
| - **Language(s) (NLP):** English |
| - **License:** Apache 2.0 |
| - **Finetuned from model:** [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) |
| |
| ### Model Sources |
| |
| <!-- Provide the basic links for the model. --> |
| |
| Please refer to [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for model sources. |
| |
| ## How to Get Started with the Model |
| |
| Use the code below to get started with the model. |
| |
| ```python |
| import torch |
| from transformers import MobileBertTokenizer, MobileBertForSequenceClassification |
| |
| |
| device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') |
| tokenizer = MobileBertTokenizer.from_pretrained('google/mobilebert-uncased') |
| model = MobileBertForSequenceClassification.from_pretrained('cssupport/mobilebert-sql-injection-detect') |
| model.to(device) |
| model.eval() |
|
|
| def predict(text): |
|     # Tokenize a single string; inputs longer than 512 tokens are truncated. |
|     inputs = tokenizer(text, padding=False, truncation=True, return_tensors='pt', max_length=512) |
|     input_ids = inputs['input_ids'].to(device) |
|     attention_mask = inputs['attention_mask'].to(device) |
| |
|     with torch.no_grad(): |
|         outputs = model(input_ids=input_ids, attention_mask=attention_mask) |
| |
|     # Convert logits to probabilities and return the top class with its probability. |
|     logits = outputs.logits |
|     probabilities = torch.softmax(logits, dim=1) |
|     predicted_class = torch.argmax(probabilities, dim=1).item() |
|     return predicted_class, probabilities[0][predicted_class].item() |
| |
|
|
| #text = "SELECT * FROM users WHERE username = 'admin' AND password = 'password';" |
| #text = "select * from users where username = 'admin' and password = 'password';" |
| #text = "SELECT * from USERS where id = '1' or @ @1 = 1 union select 1,version ( ) -- 1'" |
| #text = "select * from data where id = '1' or @" |
| text = "select * from users where id = 1 or 1#\"? = 1 or 1 = 1 -- 1" |
| predicted_class, confidence = predict(text) |
| |
| # predicted_class is an integer label (0 or 1); 1 is assumed to be the injection class. |
| if predicted_class == 1: |
|     print("Prediction: SQL Injection Detected") |
| else: |
|     print("Prediction: No SQL Injection Detected") |
| |
| print(f"Confidence: {confidence:.2f}") |
| # OUTPUT |
| # Prediction: SQL Injection Detected |
| # Confidence: 1.00 |
| ``` |
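
The getting-started code reports the top class and its probability. Where a tunable cut-off is preferred over a plain argmax, the decision can be made on the probability of the injection class instead. The sketch below shows one way to do that in plain Python. The function names, the 0.7 threshold, and the choice of index 1 as the injection class are assumptions for illustration, not documented properties of the model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def flag_injection(logits, threshold=0.7, injection_index=1):
    """Return (flagged, injection_probability) for one input's logits.

    Assumptions (not confirmed by the model card): index 1 is the
    injection class, and 0.7 is a reasonable starting cut-off.
    """
    probability = softmax(logits)[injection_index]
    return probability >= threshold, probability


# Hypothetical logits; in practice they could be taken from
# outputs.logits[0].tolist() inside predict().
flagged, probability = flag_injection([-2.0, 3.0])
print(flagged, round(probability, 4))
```

Raising the threshold trades missed attacks for fewer false alarms, so the value is best chosen by testing on representative traffic.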
| |
|
|
| ## Uses |
|
|
| <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
| [More Information Needed] |
|
|
| ### Direct Use |
|
|
| <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
| Can be used in applications that need to detect SQL injection attacks in input strings. |
| [More Information Needed] |
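
As a rough illustration of direct use, the sketch below wraps a classifier in a small guard that rejects flagged input before it goes any further. Everything here is hypothetical: `screen_input`, `SuspectedInjection`, and `stub_classifier` are illustrative names, the stub stands in for the model so the example runs without downloading weights, and class 1 is assumed to be the injection label.

```python
from typing import Callable, Tuple

# A classifier takes text and returns (predicted_class, confidence),
# the same shape that predict() returns in the getting-started code.
Classifier = Callable[[str], Tuple[int, float]]


class SuspectedInjection(ValueError):
    """Raised when the classifier flags an input (hypothetical name)."""


def screen_input(text: str, classify: Classifier, injection_class: int = 1) -> str:
    """Return text unchanged if it is not flagged, otherwise raise.

    Assumption: class 1 means "injection".
    """
    predicted_class, confidence = classify(text)
    if predicted_class == injection_class:
        raise SuspectedInjection(f"input flagged with confidence {confidence:.2f}")
    return text


def stub_classifier(text: str) -> Tuple[int, float]:
    """Stand-in for predict(); flags anything containing ' or 1 = 1'."""
    return (1, 0.99) if " or 1 = 1" in text.lower() else (0, 0.99)


print(screen_input("alice", stub_classifier))
```

In a real application the stub would be replaced by the model-backed `predict` function, passed in as the `classify` argument.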
|
|
|
|
|
|
| ### Out-of-Scope Use |
|
|
| <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. --> |
|
|
| [More Information Needed] |
|
|
| ## Bias, Risks, and Limitations |
|
|
| <!-- This section is meant to convey both technical and sociotechnical limitations. --> |
|
|
| [More Information Needed] |
|
|
| ### Recommendations |
|
|
| <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. --> |
|
|
| Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. |
|
|
|
|
|
|
| ## Technical Specifications |
|
|
| ### Model Architecture and Objective |
|
|
| [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) |
|
|
| ### Compute Infrastructure |
|
|
|
|
|
|
| #### Hardware |
|
|
| One P6000 GPU |
|
|
| #### Software |
|
|
| PyTorch and Hugging Face |