Django is a Python web framework that supports rapid development. It already has many components and elements to help you quickly deploy your site or application. You can include front-end and back-end components, and then you’re ready for the next sprint. Additionally, one of Django’s great advantages is that it already includes many security features. But you still have to pay attention, making sure you use all its characteristics appropriately.
For instance, the topic at hand is SQL injection. Let’s dive in to it with a very quick overview, some examples of how it affects you, and how to prevent such attacks, specifically using the Django framework.
What Is SQL Injection?
In a few words, it’s an attack on your application, where the attacker attempts to execute additional commands on your database. It’s called SQL injection because the attacker injects SQL commands through user inputs, thus changing the way your application behaves. This may lead to information leaks, unauthorized access, or even wiping all your data.
Now let’s look at an example. Let’s assume you have an API to authenticate a user, and within your code, you execute the following query:
select * from users where username=’myuser’ and password =’secret’;
Say a user inputs the following string: secret’ or ‘1’=’1. Then we have this:
select * from users where username=’myuser’ and password =’secret’ or ‘1’=’1‘;
The above statement is a valid SQL statement. Nevertheless, since we added the condition or ‘1’=’1′, the statement will always evaluate as TRUE. As a result, we’re bypassing the password security.
This is just a basic example. For a deeper overview, check out this article that gives further examples of SQL injections and how to prevent them.
How Do You Mitigate a SQL Injection?
The answer here is very straightforward:
To avoid this attack, you have to sanitize every user input.
And, especially for web applications, this needs to be done at both the client and server side. Why? Because most current browsers already come with the tools an attacker needs to bypass client-side validations. And if you add a REST API client, you’re already speaking only to the server, so all browser security implementation is gone.
Sanitizing inputs may be cumbersome—more so if you have to do it twice. This is where development frameworks especially come in handy because somebody else already found a way to make this simpler. Now, let’s see how Django deals with this.
Preventing SQL injection With Django
Authentication
In reference to the previous example, Django already includes a library to authenticate your users. Look at this example, taken from Django documentation:
from django.contrib.auth import authenticate
myuser = request.POST['username']
mypassword= request.POST['password']
user = authenticate(username=myuser , password=mypassword)
if user is not None:
# A backend authenticated the credentials
else:
# No backend authenticated the credentials
The above code pulls the username and password from the POST request (user input through a form). Then, the authenticate function takes care of authenticating user credentials against an authentication back end, which, in Django, can be a database or an LDAP system, for example.
Database Queries
Even though you may use direct queries in Django, you should avoid them by using Django’s Object Relational Mapping (ORM) layer. Within that layer, Django protects itself from SQL injection by using query parameterization. Within the ORM layer, Django defines SQL queries separated from the query’s parameters, and the database driver is in charge of escaping each of the parameters.
Let’s look at another example, this time presenting a Django model and how Django works with the information.
Django Model
class Blog(models.Model):
name = models.CharField(max_length=100)
tagline = models.TextField()
def __str__(self):
return self.name
This is a very simple object with two fields: name and tagline. When deployed, it will map a database table with the same fields.
Working With Data
>>> from blog.models import Blog
>>> b = Blog(name="My pet's blog", tagline='Adventures in the animal house.')
>>> b.save()
In the above example, we created a blog object, setting its initial values. Then we save it to the database, a very simple operation.
>>> b.name = 'New name'
>>> b.save()
Now we changed the name and saved it again to the database.
>>> Blog.objects.get(name__iexact="My pet's blog")
And the above command makes a query to the table, looking for the specific text.
Take a moment to look at the input text. The point with the previous examples is that even if you’re using special characters, such as the apostrophe (‘), the SQL won’t break, and it’ll use the text the way it’s meant to be used.
Using Custom Queries
Sometimes, there are situations where queries are very complex or the data doesn’t adjust to the defined models, so it’s not possible to use the ORM layer. For these scenarios, Django provides you different options to execute your SQL statements in a secure way.
Manager.raw()
This method allows you to execute an arbitrary query that returns model instances. Let’s review this example.
class Person(models.Model):
first_name = models.CharField(...)
last_name = models.CharField(...)
birth_date = models.DateField(...)
The above model represents a database table with three fields, so we can execute the following queries:
>>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' **% lname**
>>> Person.objects.raw(query)
or
>>> query = "SELECT * FROM myapp_person WHERE last_name = **'%s'**"
Both queries will execute correctly, but they contain errors. The first query is using Python string formatting, and the second one is quoting the placeholder. These are two common errors that you must avoid. The raw method has a parameter called params. You must use that parameter to pass a parameter list or a dictionary, and use %s (or %(key)s ) without quoting as a parameter placeholder. That way, the database driver will quote each parameter correctly. Below is the correct way to use the raw method.
>>> lname = 'Doe'
>>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname])
The raw method receives a query and list of parameters. It’s important to note that %s represents the placeholder for each parameter in the list.
Direct SQL
There are times when the raw method isn’t enough, and you have to execute queries directly. You may use the object django.db.connection to speak directly to the database. Take this example:
from django.db import connection
def my_custom_sql(self):
with connection.cursor() as cursor:
cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz])
cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz])
row = cursor.fetchone()
return row
Once again, to protect against SQL injection, use parameters and don’t include quotes around the %s placeholders.
RawSQL
The last way to express complex WHERE clauses is using the RawSQL expression. For example, take a look at the below:
>>> queryset.annotate(val=RawSQL("select col from sometable where othercol = %s", (someparam,)))
As in the previous section, the same parameter and placeholder rules apply. You must use a list to pass parameters to your query, don’t use text formatting, and don’t quote the placeholder.
If you don’t follow this recommendation, your query will be unsafe, and an SQL injection attack will most likely succeed.
Summary
In this post, we looked at a brief SQL injection definition. As mentioned, to learn more, take a look at the article linked above.
As you read, you may have noticed the attack pattern. It’s important to note once more how to mitigate it: you must always sanitize each and every user input. If you fail in doing so, your data and your application’s security may be in serious danger.
Finally, Django is a very powerful framework, and it can help you develop your application fast while also keeping it secure. And if you don’t use Django, at least implement the parameterized query practice. It may save you one day.
This post was written by Juan Pablo Macias Gonzalez. Juan is a computer systems engineer with experience in backend, frontend, databases and systems administration.