CS代考 BETWEEN 10 AND 20;

Live Coding Wk7 – Introduction to SQL¶

For this demo we will be exploring how to work with SQLite Databases. We’ll also be going through various SQL statements to query data from the SQLite Database.

Music.db SQLite Sample Database¶
We provide you with the SQLite sample database named Music. The Music sample database is a good database for practicing with SQL, especially SQLite.

The following database diagram illustrates the chinook database tables and their relationships.

There are 11 tables in the chinook sample database. Here are the table descriptors of the data:

Table Description
employees Stores employees data such as employee id, last name, first name, etc.
customers Stores customers data.
invoices Stores invoice header data
invoice_items Stores the invoice line items data
artists Stores artist data
albums Stores data about a list of tracks
media_types Stores media types such as MPEG audio and AAC audio files
genres Stores music types such as rock, jazz, metal, etc
tracks Stores the data of songs. Each track belongs to one album.
playlists Stores data about playlists. Each playlist contains a list of tracks. Each track may belong to multiple playlists.
playlist_track Used to reflect the relationship between the playlists table and tracks table

# Code Imports
import numpy as np
import pandas as pd
import sqlite3
from sqlite3 import Error

Connecting To Our Database¶
The first step is to connect to the database, below we have provided some boilerplate code to handle the database connection. Do not modify it, otherwise things will break. (If the database does not exist, the connection operation will create it for us)

# Our code here, do NOT modify
def create_connection(db_file):
“”” Connect to the specified SQLite database, if not exist, create a new one;
:db_file: location of db to connect to
:return: Connection object or None
conn = None
conn = sqlite3.connect(db_file)
print(“Connection established!”)
except Error as e:
return conn

db_file_loc = “data/music.db”
conn = create_connection(db_file_loc) # remember to close the connection when everything done

Connection established!

How do I extract information out of the database just created?¶
It’s really simple! Since you already have a Connection, you can create a Cursor object and call its execute() method to perform whatever SQL commands you like.

See the below example which query all records in table tasks.

cur = conn.cursor()
cur.execute(“SELECT * FROM tracks;”)
rows = cur.fetchall()
for row in rows[0:5]: #printing the first 5 rows only
print(row)

(1, ‘For Those About To Rock (We Salute You)’, 1, 1, 1, ‘ , , ‘, 343719, 11170334, 0.99)
(2, ‘Balls to the Wall’, 2, 2, 1, None, 342562, 5510424, 0.99)
(3, ‘Fast As a Shark’, 3, 2, 1, ‘F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman’, 230619, 3990994, 0.99)
(4, ‘Restless and Wild’, 3, 2, 1, ‘F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman’, 252051, 4331779, 0.99)
(5, ‘Princess of the Dawn’, 3, 2, 1, ‘Deaffy & R.A. Smith-Diesel’, 375418, 6290521, 0.99)

# simpler to use the pandas read_sql_query(), which returns a dataframe
df_tracks = pd.read_sql_query(“select * from tracks;”, conn)

TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice
0 1 For Those About To Rock (We Salute You) 1 1 1 , , 343719 11170334 0.99
1 2 Balls to the Wall 2 2 1 None 342562 5510424 0.99
2 3 Fast As a Shark 3 2 1 F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho… 230619 3990994 0.99
3 4 Restless and Wild 3 2 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D… 252051 4331779 0.99
4 5 Princess of the Dawn 3 2 1 Deaffy & R.A. Smith-Diesel 375418 6290521 0.99
… … … … … … … … … …
3498 3499 Pini Di Roma ( Rom) \ I Pini … 343 2 24 None 286741 4718950 0.99
3499 3500 String Quartet No. 12 in C Minor, D. 703 “Quar… 344 2 24 139200 2283131 0.99
3500 3501 L’orfeo, Act 3, Sinfonia (Orchestra) 345 2 24 66639 1189062 0.99
3501 3502 Quintet for Horn, Violin, 2 Violas, and Cello … 346 2 24 Mozart 221331 3665114 0.99
3502 3503 Koyaanisqatsi 347 2 10 206005 3305164 0.99

3503 rows × 9 columns

How do I list all tables contained in an SQLite database?¶
https://www.sqlite.org/faq.html

query = “SELECT name FROM sqlite_schema WHERE type = \’table\’;”
pd.read_sql_query(query, conn)

1 sqlite_sequence
3 customers
4 employees
6 invoices
7 invoice_items
8 media_types
9 playlists
10 playlist_track
12 sqlite_stat1

Querying Data From Our Database¶

1. The SELECT Statement¶
We often use the SELECT statement to query data from one or more table. The syntax of the SELECT statement is as follows:

SELECT column_list
FROM table;

Even though the SELECT clause appears before the FROM clause, SQLite evaluates the FROM clause first and then the SELECT clause, therefore:

First, specify the table where you want to get data from in the FROM clause. Note that you can have more than one table in the FROM clause.
Second, specify a column or a list of comma-separated columns in the SELECT clause.

You use the semicolon (;) to terminate the statement.

Try getting data from the tracks table such as trackid, track name, composer, and unit price and display the first 5 entries in a Pandas Dataframe.

Hint: You need specify a list column names, which you want to get data, in the SELECT clause and the tracks table in the FROM clause.

## For You Todo
tracks_query = “”””””
df_tracks = pd.read_sql_query(tracks_query, conn)
df_tracks.set_index(‘TrackId’, inplace=True)
df_tracks.head()

For a table with many columns, the query would be so long that time-consuming to type. To avoid this, you can use the asterisk (*), which is the shorthand for all columns of the table as follows:

SELECT * FROM table;

Try getting data all the columns from the tracks table and display the first 5 entries in a Pandas Dataframe.

## For You Todo

2. The ORDER BY Clause¶
SQLite stores data in the tables in an unspecified order. It means that the rows in the table may or may not be in the order that they were inserted.

If you use the SELECT statement to query data from a table, the order of rows in the result set is unspecified.

To sort the result set, you add the ORDER BY clause to the SELECT statement as follows:

select_list
column_1 ASC,
column_2 DESC;

The ORDER BY clause comes after the FROM clause. It allows you to sort the result set based on one or more columns in ascending or descending order.

In this syntax, you place the column name by which you want to sort after the ORDER BY clause followed by the ASC or DESC keyword.

The ASC keyword means ascending.
And the DESC keyword means descending.

If you don’t specify the ASC or DESC keyword, SQLite sorts the result set using the ASC option. In other words, it sorts the result set in the ascending order by default.

Now that you know how to query data and order them, query the name, milliseconds and albumid columns from the tracks tablelist and order them by albumid in ascending order.

## For You Todo
tracks_ordered_query = “”””””
df_tracks_ordered = pd.read_sql_query(tracks_ordered_query, conn)
df_tracks_ordered

With the same query you just did, try ordering them by albumid in ascending order and milliseconds in descending order.

## For You Todo
tracks_ordered_query = “”””””
df_tracks_ordered = pd.read_sql_query(tracks_ordered_query, conn)
df_tracks_ordered

SQLite sorts rows by AlbumId column in ascending order first. Then, it sorts the sorted result set by the Milliseconds column in descending order.

If you look at the tracks of the album with AlbumId 1, you find that the order of tracks changes between the two statements.

3. The SELECT DISTINCT Clause¶
The DISTINCT clause is an optional clause of the SELECT statement. The DISTINCT clause allows you to remove the duplicate rows in the result set.

The following statement illustrates the syntax of the DISTINCT clause:

SELECT DISTINCT select_list
FROM table;

It is important to note that SQLite considers NULL values as duplicates. If you use the DISTINCT clause with a column that has NULL values, SQLite will keep one row of a NULL value.

Suppose you want to know the cities where the customers locate, you can use the SELECT statement to get data from the city column of the customers. You would get result of 59 rows containing a few duplicate rows such as Berlin London, and Mountain View.

Try getting all the unique cities of customers and order them by city. Print all the unique cities.

## For You Todo
distinct_city_query = “”””””
df_city = pd.read_sql_query(distinct_city_query, conn)

## Sometimes it is handy to just use the cursor.execute() method’s output….
## Let’s copy the above query here and see the output…
distinct_city_query = “”””””

cur = conn.cursor()
cur.execute(distinct_city_query)
cities = cur.fetchall()
for city in cities:
print(f”{city[0]} “, end = ”)

Now try getting all the unique cities and countries of customers and order them by country.

Note: SQLite uses the combination of city and country to evaluate the duplicate.

## For You Todo
cityCountry_query = “”””””
df_cityCountry = pd.read_sql_query(cityCountry_query, conn)
df_cityCountry.head(10)

4. The WHERE Clause¶
The WHERE clause is an optional clause of the SELECT statement. It appears after the FROM clause as the following statement:

column_list
search_condition;

In this example, you add a WHERE clause to the SELECT statement to filter rows returned by the query. When evaluating a SELECT statement with a WHERE clause, SQLite uses the following steps:

First, check the table in the FROM clause.
Second, evaluate the conditions in the WHERE clause to get the rows that met these conditions.
Third, make the final result set based on the rows in the previous step with columns in the SELECT clause.

The search condition in the WHERE has the following form:

left_expression COMPARISON_OPERATOR right_expression

For example, you can form a search condition as follows:

WHERE column_1 = 100;

WHERE column_2 IN (1,2,3);

WHERE column_3 LIKE ‘An%’;

WHERE column_4 BETWEEN 10 AND 20;

We will use the tracks table in the sample database to illustrate how to use the WHERE clause.

Write an SQL Query to find all the tracks that have an albumid of 1 and has a duration greater than 200,000 milliseconds. Try using the AND logical operator to combine both the constraints.

## For You Todo
track1_query = “”””””
df_track1 = pd.read_sql_query(track1_query, conn)
df_track1.set_index(‘TrackId’, inplace=True)
df_track1.head(10)

WHERE Clause with LIKE Operator¶
Sometimes, you may not remember exactly the data that you want to search. In this case, you perform an inexact search using the LIKE operator.

For this exercise find all the tracks composed by Smith and order them albumid.

## For You Todo
smith_query = “”””””
df_smith = pd.read_sql_query(smith_query, conn)
df_smith.set_index(‘TrackId’, inplace=True)
df_smith.head(10)

WHERE Clause with IN Operator¶
The IN operator allows you to check whether a value is in a list of a comma-separated list of values. For this exercise, try to find tracks that have media type id is 2 or 3.

## For You Todo
media23_query = “”””””
df_media23 = pd.read_sql_query(media23_query, conn)
df_media23.set_index(‘TrackId’, inplace=True)
df_media23.head(10)

5. The LIMIT Clause¶
The LIMIT clause is an optional part of the SELECT statement. You use the LIMIT clause to constrain the number of rows returned by the query.

For example, a SELECT statement may return one million rows. However, if you just need the first 10 rows in the result set, you can add the LIMIT clause to the SELECT statement to retrieve 10 rows.

The following illustrates the syntax of the LIMIT clause.

column_list
LIMIT row_count;

The row_count is a positive integer that specifies the number of rows returned.

If you want to get the first n rows starting from the mth row of the result set, you use OFFSET keyword as the following:

column_list
LIMIT row_count OFFSET offset;

Try to get the rows 60-80 in the tracks table and print them.

## For You Todo
tracks20_query = “”””””

cur = conn.cursor()
cur.execute(tracks20_query)
tracks = cur.fetchall()
for track in tracks:
print(track[0]),

6. The GROUP BY and HAVING Clauses¶
The HAVING clause is an optional clause of the SELECT statement. The HAVING clause specifies a search condition for a group.

You often use the HAVING clause with the GROUP BY clause. The GROUP BY clause groups a set of rows into a set of summary rows or groups. Then the HAVING clause filters groups based on a specified condition.

Note that the HAVING clause is applied after GROUP BY clause, whereas the WHERE clause is applied before the GROUP BY clause.

The following illustrates the syntax of the HAVING clause:

aggregate_function (column_3)
search_condition;

In this syntax, the HAVING clause evaluates the search_condition for each group as a Boolean expression. It only includes a group in the final result set if the evaluation is true.

We will use the tracks table in the sample database for this exercise.

Write an SQL query to find all albums that have the number of tracks between 18 and 20. After filtering, order the data by albumid. Use the COUNT aggregate function in the HAVING clause to count how many tracks are there in an album.

## For You Todo
album1820_query = “”””””
df_album1820 = pd.read_sql_query(album1820_query, conn)
df_album1820.set_index(‘TrackId’, inplace=True)
df_album1820

7. The JOIN Statement¶
1. INNER JOIN¶
For the demonstration, we will use the artists and albums tables from the sample database.

An artist can have zero or many albums while an album belongs to one artist.

To query data from both artists and albums tables, you use can use an INNER JOIN, LEFT JOIN, or CROSS JOIN clause. Each join clause determines how SQLite uses data from one table to match with rows in another table.

Note that SQLite doesn’t directly support the RIGHT JOIN and FULL OUTER JOIN.

The following statement returns the album titles and their artist names:

INNER JOIN artists
ON artists.ArtistId = albums.ArtistId;

In this example, the INNER JOIN clause matches each row from the albums table with every row from the artists table based on the join condition (artists.ArtistId = albums.ArtistId) specified after the ON keyword.

If the join condition evaluates to true (or 1), the columns of rows from both albums and artists tables are included in the result set.

inJoin_query = “”””””
df_inJoin = pd.read_sql_query(inJoin_query, conn)

2. LEFT JOIN¶
This statement selects the artist names and album titles from the artists and albums tables using the LEFT JOIN clause:

LEFT JOIN albums ON
artists.ArtistId = albums.ArtistId
ORDER BY Name;

The LEFT JOIN clause selects data starting from the left table (artists) and matching rows in the right table (albums) based on the join condition (artists.ArtistId = albums.ArtistId) .

The left join returns all rows from the artists table (or left table) and the matching rows from the albums table (or right table).

Try finding artists who don’t have any albums. You can add a WHERE clause to filter out the NULL values.

leftJoin_query = “”””””
df_leftJoin = pd.read_sql_query(inJoin_query, conn)
df_leftJoin

程序代写 CS代考加微信: powcoder QQ: 1823890830 Email: powcoder@163.com

Related Posts