Thursday, January 12, 2017

A Better way to write SQL queries

Writing code is art and same applies to SQL queries. The way you structure your query, the way you write it goes a long way to communicate your intent to the fellow developer. When I see SQL queries on emails from multiple developers, I can see the stark difference in their writing style. Some developers write it so neatly and indent their query properly, which makes it easy to spot the key details e.g. which columns you are extracting from which table and what are conditions. Since in real life projects, SQL queries are hardly one-liner, learning the right way to write SQL query makes a lot of difference when you read it yourself later or you share that query to someone for review or use.


In this article, I am going to show you a couple of styles which I have tried in past, their pros and cons and what I think is the best way to write SQL query. Unless you have a good reason not to use my style e.g. you have a better style or you want to stick with the style used in your project (consistency overrules everything) there is no reason not to use it.




1st way to write SQL query

SELECT e.emp_id, e.emp_name, d.dept_name, p.project_name from Employee e 
INNER JOIN Department d ON e.dept_id = d.dept_id INNER JOIN Projects p 
ON e.project_id = p.project_id Where d.dept_name="finance" and e.emp_name 
like '%A%' and e.salary > 5000;


Pros:
1) mixed case was introduced to separate keyword from column and table names e.g. writing SELECT in capital case and writing Employee in as it is, but given you are not consistent e.g. SELECT is in caps but from is in small, there is no benefit of using that style.

Problem:
1) mixed case
2) all query is written on one line which gets unreadable as soon number of tables and columns increases
3) no flexibility on adding a new condition or running without existing condition



2nd way to write SQL query

SELECT e.emp_id, e.emp_name, d.dept_name, p.project_name
from Employee e
INNER JOIN Department d ON e.dept_id = d.dept_id
INNER JOIN Projects p ON e.project_id = p.project_id
Where d.dept_name="finance" and e.emp_name like '%A%' and e.salary > 500;

Improvement:
1) query is divided into multiple lines which make it more readable

Problems
1) mixed case
2) all conditions on where clause are on the same line, which means excluding them by commenting is not that easy.

A Better way to write SQL queries


3rd way to write SQL query

select e.emp_id, e.emp_name, d.dept_name
from Employee e
inner join Department d on e.dept_id = d.dept_id
where d.dept_name = 'finance'
and e.emp_name like '%A%'
and e.salary > 500;

1) dividing SQL queries into multiple lines makes it more readable
2) using proper indentation makes it easy to spot the source of data i.e. tables and joins
3) having conditions on separate lines allow you to run the query by commenting one of the condition e.g.

select e.emp_id, e.emp_name, d.dept_name
from Employee e
inner join Department d on e.dept_id = d.dept_id
where d.dept_name = 'finance'
-- and e.emp_name like '%A%';
add e.salary > 5000


That's all about how to write SQL query which is readable and more maintainable. Feel free to give your opinion on what do you think of this indentation or styling of SQL queries. It's a simpler technique but very powerful and goes a long way on improving the readability of your complex SQL queries.


Further Learning
Introduction to SQL by Jon Flanders
Introduction to SQL Server
Head First SQL

Thanks for reading this article and let me know how do you write SQL queries? which style you use, or you have your own style?

8 comments :

Badal Chowdhary said...

Nice post. 1 additional input: make all the sql text lower case and keywords upper case. it can help distinguish between key words and column/table names.

In SQL Developer, can be easily achieved by:
1. Ctrl + Shift + ' (change case)
2. Ctrl + F7 (Format: keywords are changed to uppercase)

Mladen said...

I think upper case sql keywords are not bad. If there is no sql highlighting available (maybe the sql is embedded in another language or just an email) then it's much easier to read. I also prefer indentation in some cases.

Turnkey Solutions said...

You have barely scratched the surface of SQL at this point. What about subqueries?

Anonymous said...

I always put the new table name first in a join clause, not second. That way the join looks more consistent if you're joining to Table c with clauses correlating to Table a and Table b, and if you have a simple join constraint on c, e.g. "
LEFT JOIN c
ON c.foo = a.foo
AND c.bar = b.bar
AND c.baz LIKE 'A%'

Anonymous said...

If you want to run query by commenting conditions, you better write
where 1=1
and id > 100
and name like '%s%'

CSC Programmer said...

I agree with @Badal that you should make all the sql text lower case and keywords upper case to help distinguish between key words and table names. Nice tutorial.

rvt said...

I also think that the upper case for SQL keywords is not a bad thing at all.
The 'Department' and Employee with uppercase really bug's me here. That is camel case mixing with underscore_case.
Joining on e.emp_id = d.dept_id sounds very fishy, why would you want a employment id being joined on department id?? Or perhaps is your naming all goovy? Where project_name comes from totally baffles me... If you talk about good style, at least show this correctly on your website with correctly and working example SQL's (assuming the table exists).

For anybody looking for good books about SQL, Joe Celko's book simply rock!

Javin Paul said...

Hello @rvt, I agree with you. Actually some typo left, e.emp_id = d.dept_id should be read as e.dept_id = d.dept_id, a key column between these two table where dept_id is the primary key in Department but foreign key in Employee. Similarly, p.project_name come from Project table which represent the relationship between Employee and Project, again project_id is primary key in Project table and foreign key in Employee table, so if you need project name an employee is working on, you need to join Employee with Project.

Initially I wanted to show examples with multiple join conditions but later refactored to use just one join for simplicity, but looks like some typo left.

Thanks for pointing them out, I'll correct them.

Post a Comment