	---
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-classification
	---
	# Model Card for cssupport/mobilebert-sql-injection-detect


	Based on [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased). MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks. This model detects SQL injection attacks in an input string (see "How to Get Started with the Model" below). It is a very light model (about 100 MB) and can be used for edge computing use cases. The dataset used is [sql-injection-dataset](https://www.kaggle.com/datasets/sajid576/sql-injection-dataset) from [Kaggle](https://www.kaggle.com).
	Please test the model before deploying it into any environment.
	Contact us for more info: support@cloudsummary.com
	### Code Repo
	The code is available at https://github.com/cssupport23/AI-Model---SQL-Injection-Attack-Detector

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->



	- Developed by: cssupport (support@cloudsummary.com)
	- Model type: Text classification model (fine-tuned MobileBERT)
	- Language(s) (NLP): English
	- License: Apache 2.0
	- Finetuned from model: [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased)

	### Model Sources

	<!-- Provide the basic links for the model. -->

	Please refer to [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for model sources.

	## How to Get Started with the Model

	Use the code below to get started with the model.

	```python
	import torch
	from transformers import MobileBertTokenizer, MobileBertForSequenceClassification


	device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
	tokenizer = MobileBertTokenizer.from_pretrained('google/mobilebert-uncased')
	model = MobileBertForSequenceClassification.from_pretrained('cssupport/mobilebert-sql-injection-detect')
	model.to(device)
	model.eval()

	def predict(text):
	    """Return the predicted class index and its softmax probability."""
	    inputs = tokenizer(text, padding=False, truncation=True, return_tensors='pt', max_length=512)
	    input_ids = inputs['input_ids'].to(device)
	    attention_mask = inputs['attention_mask'].to(device)

	    # Inference only, so gradient tracking is disabled.
	    with torch.no_grad():
	        outputs = model(input_ids=input_ids, attention_mask=attention_mask)

	    logits = outputs.logits
	    probabilities = torch.softmax(logits, dim=1)
	    predicted_class = torch.argmax(probabilities, dim=1).item()
	    return predicted_class, probabilities[0][predicted_class].item()


	#text = "SELECT * FROM users WHERE username = 'admin' AND password = 'password';"
	#text = "select * from users where username = 'admin' and password = 'password';"
	#text = "SELECT * from USERS where id = '1' or @ @1 = 1 union select 1,version ( ) -- 1'"
	#text = "select * from data where id = '1' or @"
	text = "select * from users where id = 1 or 1#\"? = 1 or 1 = 1 -- 1"
	predicted_class, confidence = predict(text)

	# predicted_class is a class index (0 or 1), not a probability; 1 is treated as SQL injection.
	if predicted_class == 1:
	    print("Prediction: SQL Injection Detected")
	else:
	    print("Prediction: No SQL Injection Detected")

	print(f"Confidence: {confidence:.2f}")
	# OUTPUT
	# Prediction: SQL Injection Detected
	# Confidence: 1.00
	```
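
If a decision threshold such as 0.7 is wanted, it probably makes more sense to apply it to the probability of the injection class than to the class index. The following is a minimal sketch of that idea in plain Python so it runs without downloading the model. It assumes that index 1 is the SQL injection class (as the snippet above treats it); `softmax`, `is_injection`, and the example logits are hypothetical names and values, not part of the released code.

```python
import math

INJECTION_LABEL = 1  # assumption: index 1 is the "SQL injection" class


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_injection(logits, threshold=0.7):
    """Flag the input when the injection-class probability reaches the threshold."""
    probs = softmax(logits)
    injection_prob = probs[INJECTION_LABEL]
    return injection_prob >= threshold, injection_prob


# Made-up logits for illustration only (not real model output).
flagged, prob = is_injection([-2.0, 3.0])
print(f"flagged={flagged}, injection probability={prob:.3f}")
```

With the real model, the logits for a single input could be taken from `outputs.logits[0].tolist()`. The 0.7 default only mirrors the value that appears in the snippet above; it is not a tuned threshold.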


	## Uses

	<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

	[More Information Needed]

	### Direct Use

	<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
	Can be used in applications that need to check input strings for SQL injection attacks.
	[More Information Needed]
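
Because the model classifies an input string as SQL injection or not, one possible direct use is a screening step in front of an application. The sketch below assumes a classifier callable that returns `(predicted_class, confidence)` with class 1 meaning injection, like the `predict` function in the getting-started snippet. `screen_input`, `fake_classify`, and `min_confidence` are hypothetical names introduced for illustration, and `fake_classify` is only a stand-in so the example runs without the model.

```python
from typing import Callable, Tuple

# Assumption: the classifier returns (predicted_class, confidence),
# with class 1 meaning "SQL injection".
Classifier = Callable[[str], Tuple[int, float]]


def screen_input(text: str, classify: Classifier, min_confidence: float = 0.5) -> bool:
    """Return True when the input may proceed, False when it should be blocked."""
    predicted_class, confidence = classify(text)
    return not (predicted_class == 1 and confidence >= min_confidence)


def fake_classify(text: str) -> Tuple[int, float]:
    """Hypothetical stand-in for the model; a real setup would call predict()."""
    suspicious = " or 1 = 1" in text.lower()
    return (1, 0.99) if suspicious else (0, 0.99)


for candidate in ["alice", "1 or 1 = 1 -- 1"]:
    verdict = "allowed" if screen_input(candidate, fake_classify) else "blocked"
    print(f"{candidate!r}: {verdict}")
```

Passing the classifier in as an argument keeps the screening logic independent of how the model is loaded, which makes it easy to swap the stand-in for the real `predict` function.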



	### Out-of-Scope Use

	<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

	[More Information Needed]

	## Bias, Risks, and Limitations

	<!-- This section is meant to convey both technical and sociotechnical limitations. -->

	[More Information Needed]

	### Recommendations

	<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

	Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
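
Since this card asks that the model be tested before deployment, one way to make the risks concrete is to measure false positives and false negatives on your own labeled inputs. The sketch below is a minimal example of that, assuming label 1 means SQL injection. `evaluate`, `stub_predict`, and the four sample strings are hypothetical placeholders, not results from this model; in practice `predict_fn` would wrap the real classifier and the examples would come from your own traffic.

```python
def evaluate(predict_fn, labeled_examples):
    """Count outcomes for the positive class (1 = injection, an assumption)."""
    tp = fp = tn = fn = 0
    for text, label in labeled_examples:
        predicted = predict_fn(text)
        if predicted == 1 and label == 1:
            tp += 1  # attack correctly flagged
        elif predicted == 1 and label == 0:
            fp += 1  # benign input wrongly flagged
        elif predicted == 0 and label == 0:
            tn += 1  # benign input correctly passed
        else:
            fn += 1  # attack missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision, "recall": recall}


def stub_predict(text):
    """Hypothetical stand-in classifier used only to make the sketch runnable."""
    return 1 if "union select" in text.lower() else 0


# Placeholder examples; replace with labeled inputs from your own environment.
examples = [
    ("select name from users where id = 5", 0),
    ("1' union select 1, version() -- ", 1),
    ("1 or 1 = 1", 1),
    ("hello world", 0),
]
print(evaluate(stub_predict, examples))
```

For a detector like this, missed attacks (`fn`) and blocked legitimate inputs (`fp`) carry different costs, so it is worth looking at both counts rather than a single accuracy figure.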



	## Technical Specifications

	### Model Architecture and Objective

	[google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased)

	### Compute Infrastructure



	#### Hardware

	One P6000 GPU

	#### Software

	PyTorch and Hugging Face Transformers