Django is a Python web framework that supports rapid development. It already has many components and elements to help you quickly deploy your site or application. You can include front-end and back-end components, and then youโre ready for the next sprint. Additionally, one of Djangoโs great advantages is that it already includes many security features. But you still have to pay attention, making sure you use all its characteristics appropriately.
For instance, the topic at hand is SQL injection. Letโs dive in to it with a very quick overview, some examples of how it affects you, and how to prevent such attacks, specifically using the Django framework.
What Is SQL Injection?
In a few words, itโs an attack on your application, where the attacker attempts to execute additional commands on your database. Itโs called SQL injection because the attacker injects SQL commands through user inputs, thus changing the way your application behaves. This may lead to information leaks, unauthorized access, or even wiping all your data.
Now letโs look at an example. Letโs assume you have an API to authenticate a user, and within your code, you execute the following query:
select * from users where username=โmyuserโ and password =โsecretโ;
Say a user inputs the following string: secretโ or โ1โ=โ1 . Then we have this:
select * from users where username=โmyuserโ and password =โ secretโ or โ1โ=โ1 โ;
The above statement is a valid SQL statement. Nevertheless, since we added the condition or โ1โ=โ1โฒ , the statement will always evaluate as TRUE. As a result, weโre bypassing the password security.
This is just a basic example. For a deeper overview, check out this article that gives further examples of SQL injections and how to prevent them.
How Do You Mitigate a SQL Injection?
The answer here is very straightforward:
To avoid this attack, you have to sanitize every user input.
And, especially for web applications, this needs to be done at both the client and server side. Why? Because most current browsers already come with the tools an attacker needs to bypass client-side validations. And if you add a REST API client, youโre already speaking only to the server, so all browser security implementation is gone.
Sanitizing inputs may be cumbersomeโmore so if you have to do it twice. This is where development frameworks especiallycome in handy because somebody else already found a way to make this simpler. Now, letโs see how Django deals with this.
Preventing SQL injection With Django
Authentication
In reference to the previous example, Django already includes a library to authenticate your users. Look at this example, taken from Django documentation :
from django.contrib.auth import authenticate
myuser = request.POST['username']
mypassword= request.POST['password']
user = authenticate(username=myuser , password=mypassword)
if user is not None:
# A backend authenticated the credentials
else:
# No backend authenticated the credentials
The above code pulls the username and password from the POST request (user input through a form). Then, the authenticate function takes care of authenticating user credentials against an authentication back end, which, in Django, can be a database or an LDAP system, for example.
Database Queries
Even though you may use direct queries in Django, you should avoid them by using Djangoโs Object Relational Mapping (ORM) layer. Within that layer, Django protects itself from SQL injection by using query parameterization . Within the ORM layer, Django defines SQL queries separated from the queryโs parameters, and the database driver is in charge of escaping each of the parameters.
Letโs look at another example, this time presenting a Django model and how Django works with the information.
Django Model
class Blog(models.Model):
name = models.CharField(max_length=100)
tagline = models.TextField()
def __str__(self):
return self.name
This is a very simple object with two fields: name and tagline. When deployed, it will map a database table with the same fields.
Working With Data
>>> from blog.models import Blog
>>> b = Blog(name="My pet's blog", tagline='Adventures in the animal house.')
>>> b.save()
In the above example, we created a blog object, setting its initial values. Then we save it to the database, a very simple operation.
>>> b.name = 'New name'
>>> b.save()
Now we changed the name and saved it again to the database.
>>> Blog.objects.get(name__iexact="My pet's blog")
And the above command makes a query to the table, looking for the specific text.
Take a moment to look at the input text. The point with the previous examples is that even if youโre using special characters, such as the apostrophe (โ), the SQL wonโt break, and itโll use the text the way itโs meant to be used.
Using Custom Queries
Sometimes, there are situations where queries are very complex or the data doesnโt adjust to the defined models, so itโs not possible to use the ORM layer. For these scenarios, Django provides you different options to execute your SQL statements in a secure way.
Manager.raw()
This method allows you to execute an arbitrary query that returns model instances. Letโs review this example.
class Person(models.Model):
first_name = models.CharField(...)
last_name = models.CharField(...)
birth_date = models.DateField(...)
The above model represents a database table with three fields, so we can execute the following queries:
>>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' **% lname**
>>> Person.objects.raw(query)
or
>>> query = "SELECT * FROM myapp_person WHERE last_name = **'%s'**"
Both queries will execute correctly, but they contain errors. The first query is using Python string formatting, and the second one is quoting the placeholder. These are two common errors that you must avoid . The raw method has a parameter called params . You must use that parameter to pass a parameter list or a dictionary, and use %s (or %(key)s ) without quoting as a parameter placeholder. That way, the database driver will quote each parameter correctly. Below is the correct way to use the raw method.
>>> lname = 'Doe'
>>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname])
The raw method receives a query and list of parameters. Itโs important to note that %s represents the placeholder for each parameter in the list.
Direct SQL
There are times when the raw method isnโt enough, and you have to execute queries directly. You may use the object django.db.connection to speak directly to the database. Take this example :
from django.db import connection
def my_custom_sql(self):
with connection.cursor() as cursor:
cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz])
cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz])
row = cursor.fetchone()
return row
Once again, to protect against SQL injection, use parameters and donโt include quotes around the %s placeholders.
RawSQL
The last way to express complex WHERE clauses is using the RawSQL expression. For example, take a look at the below:
>>> queryset.annotate(val=RawSQL("select col from sometable where othercol = %s", (someparam,)))
As in the previous section, the same parameter and placeholder rules apply. You must use a list to pass parameters to your query, donโt use text formatting, and donโt quote the placeholder.
If you donโt follow this recommendation , your query will be unsafe, and an SQL injection attack will most likely succeed.
Summary
In this post, we looked at a brief SQL injection definition. As mentioned, to learn more, take a look at the article linked above.
As you read, you may have noticed the attack pattern. Itโs important to note once more how to mitigate it: you must always sanitize each and every user input. If you fail in doing so, your data and your applicationโs security may be in serious danger.
Finally, Django is a very powerful framework, and it can help you develop your application fast while also keeping it secure. And if you donโt use Django, at least implement the parameterized query practice. It may save you one day.
This post was written by Juan Pablo Macias Gonzalez. Juan is a computer systems engineer with experience in backend, frontend, databases and systems administration.