SQL Workshop
What is it
SQL (Structured Query Language) is a querying language, plain and simple. If you've ever used formulas in Excel or Google Sheets, you'll feel some familiarity here. The reason why SQL exists is at some point the spreadsheet gets too large that it slows down, has too many sheets, or the formulas get too complex to maintain.
Why learn it
SQL has a place in every industry:
- software development
- business intelligence
- marketing analysis
- algorithmic trading
- data visualization
- etc
It is undoubtedly one of the most valuable skills you can have.
What does it look like
First, let's contrast it to what we may already know about spreadsheets
Excel Spreadsheet => SQL Database
Sheet => Table
The jargon is not too bad. Let's take a look at this overgrown spreadsheet, made up of many sheets.
We have a perfect representation of the database we'll be playing with in the right panel, (or open in a new window )
Querying
The most basic query is to just list everything from a single table. Let's look
at the albums table.
- First, we always
SELECTthe fields that we want:SELECT albums.Title - The we always write
FROMwhere the data is coming:FROM albums
These are the absolutely necessary part of every query
-- You can write comments like this
SELECT albums.Title
FROM albums; -- End your query with a semicolon
ORDER BY
So maybe not too impressive. Let's try some magic tricks. Maybe we can start by
ORDERing them BY title.
SELECT albums.Title
FROM albums
ORDER BY albums.Title;
This will order by ascending order (small to big)
You can also flip the order by using DESC (big to small)
SELECT albums.Title
FROM albums
ORDER BY albums.Title DESC;
WHERE
And we only want the ones that start with Q
SELECT albums.Title
FROM albums
WHERE albums.Title LIKE 'Q%'
ORDER BY albums.Title;
Notice the order of the WHERE and the ORDER BY statements. The order of these
statements are the first "gotcha". A quick cheatsheet for the order of the most
popular statements is
SELECTFROMJOINWHEREGROUP BYORDER BYLIMIT
JOIN
JOIN is one of the most powerful features of a relational database. In Excel,
you might have used something similar like VLOOKUP, but SQL JOINs are much
more powerful.
If you like puzzles, you'll love joining. It's all about finding a column on
one table that matches values on another table. For instance, look up at the
artists table, and notice it has a ArtistId column. This is called the
Primary Key, meaning that every artist can be associated with this id, and
there are no duplicates.
Now look at the albums table. It also has an ArtistId column. Each one of these
numbers are referring to an id on the artists table. This is called a Foreign Key.
You can have duplicates in this column because multiple albums can belong to the
same artist. Let's join these tables together and view the artist for every album!
SELECT artists.Name, albums.Title
FROM artists
JOIN albums ON artists.ArtistId = albums.ArtistId;
Let's join tracks on albums!
SELECT albums.Title, tracks.Name
FROM albums
JOIN tracks ON tracks.AlbumId = albums.AlbumId;
Let's join artists, albums, and tracks!
SELECT artists.Name, albums.Title, tracks.Name
FROM artists
JOIN albums ON artists.ArtistId = albums.ArtistId
JOIN tracks ON tracks.AlbumId = albums.AlbumId;
Here is a map of our database to help solve our "mazes"

GROUP BY
That's a real nice query, but that's a lot of duplicate data, and what if I want
to know how many albums each artist had? Or how many tracks each album had? We
can run another query to GROUP BY the columns with duplicates together, and
count how many got squeezed.
SELECT artists.Name, albums.Title
FROM artists
JOIN albums ON artists.ArtistId = albums.ArtistId
GROUP BY artists.Name;
So this squeezed the artists.Names together so there are now no duplicate, but
it just gives the first album, not very helpful. What we can't see here is that
all the albums are squished together also, even though it is only showing the first.
We can count those albums using the COUNT function
SELECT artists.Name, COUNT(albums.Title)
FROM artists
JOIN albums ON artists.ArtistId = albums.ArtistId
GROUP BY artists.Name;
So now let's ORDER BY the counts to see who has the most albums!
SELECT artists.Name, COUNT(albums.Title)
FROM artists
JOIN albums ON artists.ArtistId = albums.ArtistId
GROUP BY artists.Name
ORDER BY COUNT(albums.Title) DESC;
LIMIT
LIMIT is pretty straightforward. It returns only the number of rows you need.
Let's return the top 10 artists with the most albums.
SELECT artists.Name, COUNT(albums.Title)
FROM artists
JOIN albums ON artists.ArtistId = albums.ArtistId
GROUP BY artists.Name
ORDER BY COUNT(albums.Title) DESC
LIMIT 10;
Pivot Tables
You'll notice some tables looking a little slim on details, such as the playlist_track
table. This is called a Pivot table, or Join table. It's only purpose is to
join many rows from one table to many rows of another table. This creates a
many-to-many relationship, meaning many tracks can belong to many playlist, or
a track can appear on more than one playlist and a playlist can have more than one
track. You join them just the same, and can be very useful when trying to find a link
between multiple tables, like a maze.
SELECT playlists.name, tracks.Name
FROM playlists
JOIN playlist_track ON playlists.PlaylistId = playlist_track.PlaylistId
JOIN tracks ON tracks.TrackId = playlist_track.TrackId;
Challenges
- Which albums have the most tracks?
Which albums have the most track time?
- HINT: You can use the
SUM()function ontracks.milliseconds
- HINT: You can use the
Which
artistshave the most track time?- Which
playlistshave the mosttracks? - Which
tracksappears on the mostplaylists? - Which
playlistshave the most track time?
© 2019 Austin Coding Academy