SQL is a fundamental skill for data engineering and analytics. With the right understanding of a few key concepts, you can handle most of the tasks required in data manipulation and analysis. Let’s dive into the details of the SQL keywords and functions mentioned earlier.
Master the Most Important SQL Queries, Keywords, and Functions
SELECT
Purpose: Retrieve specific columns or all columns from a table in a database.
Key Features:
- Allows fetching large datasets for analysis.
- Use * to retrieve all columns.
Example:
SELECT name, age, salary
FROM employees;
Retrieves the name, age, and salary columns from the employees table.
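To try the query above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table, its sample rows, and the extra department column are invented for illustration; the extra column is only there to show that unlisted columns are left out.

```python
import sqlite3

# In-memory database with a hypothetical employees table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, age INTEGER, salary REAL, department TEXT);
    INSERT INTO employees VALUES
        ('Ana', 34, 72000, 'Engineering'),
        ('Ben', 41, 65000, 'Sales'),
        ('Cara', 29, 58000, 'Sales');
""")

# Named columns: only name, age, and salary come back.
cur = conn.execute("SELECT name, age, salary FROM employees")
selected_columns = [col[0] for col in cur.description]
selected_rows = cur.fetchall()

# SELECT * returns every column, including department.
cur = conn.execute("SELECT * FROM employees")
all_columns = [col[0] for col in cur.description]

print(selected_columns)  # ['name', 'age', 'salary']
print(all_columns)       # ['name', 'age', 'salary', 'department']
```

Listing columns explicitly keeps the result narrow and makes the query robust to columns being added to the table later.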
WHERE
Purpose: Filter rows based on specific conditions.
Key Features:
- Helps in fetching only the required data by applying conditions.
- Can use operators like =, <, >, <=, >=, !=, BETWEEN, LIKE, etc.
Example:
SELECT *
FROM sales
WHERE region = 'North America' AND sale_date > '2023-01-01';
Fetches all rows where the region is North America and the sale date is after January 1, 2023.
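BETWEEN and LIKE from the operator list can be sketched the same way. The snippet below runs against an in-memory SQLite database through Python's sqlite3 module; the id and amount columns and all sample rows are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER, region TEXT, sale_date TEXT, amount REAL);
    INSERT INTO sales VALUES
        (1, 'North America', '2022-12-15', 120.0),
        (2, 'North America', '2023-03-10', 250.0),
        (3, 'North Africa',  '2023-05-02', 90.0),
        (4, 'Europe',        '2023-06-20', 310.0);
""")

# BETWEEN is inclusive on both ends; ISO-8601 date strings compare correctly as text.
in_2023_h1 = conn.execute(
    "SELECT id FROM sales "
    "WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30' ORDER BY id"
).fetchall()

# LIKE 'North%' matches both 'North America' and 'North Africa';
# the amount condition then drops the 90.0 sale.
north_regions = conn.execute(
    "SELECT id FROM sales WHERE region LIKE 'North%' AND amount >= 100 ORDER BY id"
).fetchall()

print(in_2023_h1)     # [(2,), (3,), (4,)]
print(north_regions)  # [(1,), (2,)]
```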
ORDER BY
Purpose: Sort the result set based on one or more columns.
Key Features:
- Default sorting is in ascending order (ASC), but you can explicitly use DESC for descending order.
Example:
SELECT product_name, price
FROM products
ORDER BY price DESC;
Retrieves all products sorted by price in descending order.
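Sorting on more than one column works by listing the columns in priority order, each with its own direction. A small sketch with Python's sqlite3 module follows; the category column and the sample products are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_name TEXT, category TEXT, price REAL);
    INSERT INTO products VALUES
        ('Desk',    'Furniture',   180.0),
        ('Lamp',    'Furniture',    35.0),
        ('Mouse',   'Electronics',  25.0),
        ('Monitor', 'Electronics', 220.0);
""")

# Sort by category A-Z first, then by price from highest to lowest within each category.
sorted_rows = conn.execute(
    "SELECT category, product_name, price FROM products "
    "ORDER BY category ASC, price DESC"
).fetchall()

for row in sorted_rows:
    print(row)
```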
Aggregation Functions (MIN, MAX, AVG, COUNT, SUM)
Purpose: Perform calculations on multiple rows of a table and return a single value.
Key Features:
- SUM: Adds values.
- AVG: Calculates the average.
- MIN & MAX: Find the smallest and largest values, respectively.
- COUNT: Counts the number of rows.
Example:
SELECT COUNT(*) AS total_employees, AVG(salary) AS average_salary
FROM employees;
Returns the total number of employees and the average salary from the employees table.
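One detail worth seeing in action is how aggregates treat NULL: COUNT(*) counts every row, while COUNT(column), SUM, AVG, MIN, and MAX skip NULL values. The sketch below uses Python's sqlite3 module with an invented bonus column and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary REAL, bonus REAL);
    INSERT INTO employees VALUES
        ('Ana',  70000, 5000),
        ('Ben',  60000, NULL),
        ('Cara', 50000, 1000);
""")

row = conn.execute("""
    SELECT COUNT(*)      AS total_rows,       -- counts all rows
           COUNT(bonus)  AS rows_with_bonus,  -- skips the NULL bonus
           SUM(salary)   AS total_salary,
           MIN(salary)   AS lowest_salary,
           MAX(salary)   AS highest_salary,
           AVG(bonus)    AS average_bonus     -- averages only the two non-NULL bonuses
    FROM employees
""").fetchone()

print(row)  # (3, 2, 180000.0, 50000.0, 70000.0, 3000.0)
```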
GROUP BY
Purpose: Group rows with the same values in specified columns and aggregate data.
Key Features:
- Often used with aggregation functions.
- Groups data based on one or more columns.
Example:
SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY department;
Calculates the total salary for each department in the employees table.
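Grouping on more than one column produces one output row per distinct combination of values. Here is a hedged sketch using Python's sqlite3 module; the title column and the four sample employees are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, title TEXT, salary REAL);
    INSERT INTO employees VALUES
        ('Ana',  'Engineering', 'Engineer',  90000),
        ('Ben',  'Engineering', 'Engineer',  80000),
        ('Cara', 'Engineering', 'Manager',  110000),
        ('Dev',  'Sales',       'Rep',       50000);
""")

# One row per department, with several aggregates side by side.
by_department = conn.execute("""
    SELECT department, COUNT(*) AS headcount, SUM(salary) AS total_salary
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

# One row per (department, title) combination.
by_department_and_title = conn.execute("""
    SELECT department, title, AVG(salary) AS average_salary
    FROM employees
    GROUP BY department, title
    ORDER BY department, title
""").fetchall()

print(by_department)
print(by_department_and_title)
```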
JOINS (INNER, LEFT, RIGHT, FULL, SELF)
Purpose: Combine rows from two or more tables based on a related column.
Key Features:
- INNER JOIN: Returns rows with matching values in both tables.
- LEFT JOIN: Returns all rows from the left table and matching rows from the right table; unmatched right-side columns are NULL.
- RIGHT JOIN: Returns all rows from the right table and matching rows from the left table; unmatched left-side columns are NULL.
- FULL JOIN: Returns all rows from both tables, filling in NULL wherever one side has no match.
- SELF JOIN: Joins a table to itself, using aliases to tell the two copies apart.
Example:
SELECT employees.name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.id;
Combines employees and departments tables to fetch employee names along with their respective department names.
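The difference between INNER and LEFT joins, and the idea of a self join, is easiest to see on a tiny dataset. The following sketch uses Python's sqlite3 module; the manager_id column and all rows are hypothetical, and RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT,
                            department_id INTEGER, manager_id INTEGER);
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES
        (1, 'Ana',  1,    NULL),  -- no manager
        (2, 'Ben',  1,    1),
        (3, 'Cara', NULL, 1);     -- no department
""")

# INNER JOIN drops Cara because she has no matching department.
inner_rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees AS e
    INNER JOIN departments AS d ON e.department_id = d.id
    ORDER BY e.id
""").fetchall()

# LEFT JOIN keeps Cara and fills the department name with NULL (None in Python).
left_rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees AS e
    LEFT JOIN departments AS d ON e.department_id = d.id
    ORDER BY e.id
""").fetchall()

# Self join: the employees table is joined to itself to look up each manager's name.
self_rows = conn.execute("""
    SELECT e.name, m.name AS manager_name
    FROM employees AS e
    INNER JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(inner_rows)  # [('Ana', 'Engineering'), ('Ben', 'Engineering')]
print(left_rows)   # [('Ana', 'Engineering'), ('Ben', 'Engineering'), ('Cara', None)]
print(self_rows)   # [('Ben', 'Ana'), ('Cara', 'Ana')]
```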
Common Table Expressions (CTE)
Purpose: Simplify complex queries by breaking them into smaller, reusable parts.
Key Features:
- Improves readability and maintainability of SQL queries.
- Defined using the WITH keyword.
Example:
WITH SalesCTE AS (
SELECT region, SUM(sales) AS total_sales
FROM sales
GROUP BY region
)
SELECT *
FROM SalesCTE
WHERE total_sales > 100000;
First calculates total sales by region and then filters regions with sales above 100,000.
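Several CTEs can be chained in one WITH clause, each building on the previous one. As a rough sketch (run through Python's sqlite3 module, with invented region names and figures), the query below keeps only regions whose total is above the average regional total; the CTE names region_totals and overall are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, sales REAL);
    INSERT INTO sales VALUES
        ('East', 60000), ('East', 70000),
        ('West', 40000), ('West', 20000),
        ('North', 150000);
""")

rows = conn.execute("""
    WITH region_totals AS (          -- step 1: total sales per region
        SELECT region, SUM(sales) AS total_sales
        FROM sales
        GROUP BY region
    ),
    overall AS (                     -- step 2: average of those totals (a single row)
        SELECT AVG(total_sales) AS avg_total
        FROM region_totals
    )
    SELECT r.region, r.total_sales   -- step 3: keep regions above the average
    FROM region_totals AS r
    CROSS JOIN overall AS o
    WHERE r.total_sales > o.avg_total
    ORDER BY r.total_sales DESC
""").fetchall()

print(rows)  # [('North', 150000.0), ('East', 130000.0)]
```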
Working with DATETIME/TIMESTAMP
Purpose: Handle and manipulate date and time data.
Key Features:
- Convert string dates to date format.
- Extract parts like year, month, day, or calculate differences.
Example:
SELECT employee_name, EXTRACT(YEAR FROM hire_date) AS hire_year
FROM employees;
Extracts the year from the hire_date column.
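Date functions vary a lot between SQL dialects, so treat the following as a sketch for one engine only. SQLite has no EXTRACT; its strftime and julianday functions stand in here for extracting date parts and computing a difference in days. The sample rows and the fixed reference date 2024-01-01 are assumptions chosen to keep the output reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_name TEXT, hire_date TEXT);  -- ISO-8601 date strings
    INSERT INTO employees VALUES
        ('Ana', '2020-03-15'),
        ('Ben', '2023-11-01');
""")

rows = conn.execute("""
    SELECT employee_name,
           CAST(strftime('%Y', hire_date) AS INTEGER) AS hire_year,   -- year part
           CAST(strftime('%m', hire_date) AS INTEGER) AS hire_month,  -- month part
           -- julianday() turns a date string into a day count, so subtraction gives days
           CAST(julianday('2024-01-01') - julianday(hire_date) AS INTEGER) AS days_employed
    FROM employees
    ORDER BY employee_name
""").fetchall()

print(rows)  # [('Ana', 2020, 3, 1387), ('Ben', 2023, 11, 61)]
```

In engines that support it, EXTRACT(YEAR FROM hire_date) plays the role that strftime('%Y', ...) plays here.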
Numbering Functions (RANK, DENSE_RANK, ROW_NUMBER)
Purpose: Generate rankings or row numbers for rows in a result set.
Key Features:
- RANK: Assigns ranks with gaps for ties.
- DENSE_RANK: Assigns consecutive ranks even with ties.
- ROW_NUMBER: Assigns a unique row number without gaps.
Example:
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;
Assigns a rank to each employee based on their salary in descending order.
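Putting the three functions side by side on data with a tie makes the difference concrete. This sketch uses Python's sqlite3 module and assumes SQLite 3.25 or newer, which is when window functions were added; the four employees are invented, and name is added as a tiebreaker so ROW_NUMBER is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 90000), ('Ben', 80000), ('Cara', 80000), ('Dev', 70000);
""")

rows = conn.execute("""
    SELECT name, salary,
           RANK()       OVER (ORDER BY salary DESC)       AS rnk,        -- 1, 2, 2, 4
           DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk,  -- 1, 2, 2, 3
           ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num     -- 1, 2, 3, 4
    FROM employees
    ORDER BY salary DESC, name
""").fetchall()

for row in rows:
    print(row)
```

Ben and Cara tie on salary: RANK skips to 4 for the next row, DENSE_RANK continues with 3, and ROW_NUMBER never repeats a value.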
Additional Keywords and Functions
IFNULL / COALESCE
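Purpose: Handle NULL values by replacing them with a default value. COALESCE is standard SQL and accepts any number of arguments; IFNULL is a two-argument variant available in dialects such as MySQL, SQLite, and BigQuery.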
Example:
SELECT name, IFNULL(bonus, 0) AS bonus
FROM employees;
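COALESCE returns the first non-NULL argument, which allows a chain of fallbacks. A minimal sketch with Python's sqlite3 module follows; the default_bonus column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, bonus REAL, default_bonus REAL);
    INSERT INTO employees VALUES
        ('Ana',  500,  100),
        ('Ben',  NULL, 100),
        ('Cara', NULL, NULL);
""")

# Try bonus first, then default_bonus, then fall back to 0.
rows = conn.execute("""
    SELECT name, COALESCE(bonus, default_bonus, 0) AS bonus
    FROM employees
    ORDER BY name
""").fetchall()

print(rows)  # [('Ana', 500.0), ('Ben', 100.0), ('Cara', 0)]
```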
CASTING
Purpose: Convert data types from one format to another.
Example:
SELECT CAST(salary AS CHAR) AS salary_string
FROM employees;
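A common reason to cast is that numbers stored as text sort and compare as text. The sketch below, run through Python's sqlite3 module, uses a made-up readings table to show the effect; the table and column names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (raw_value TEXT);  -- numbers stored as strings
    INSERT INTO readings VALUES ('9'), ('10'), ('100');
""")

# Sorting the raw text compares character by character, so '10' comes before '9'.
text_order = [r[0] for r in conn.execute(
    "SELECT raw_value FROM readings ORDER BY raw_value"
)]

# Casting to INTEGER first gives the expected numeric order.
numeric_order = [r[0] for r in conn.execute(
    "SELECT CAST(raw_value AS INTEGER) AS v FROM readings ORDER BY v"
)]

print(text_order)     # ['10', '100', '9']
print(numeric_order)  # [9, 10, 100]
```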
MERGE STATEMENT
Purpose: Perform INSERT, UPDATE, or DELETE operations in a single query.
Example:
MERGE INTO employees AS target
USING new_employees AS source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET target.salary = source.salary
WHEN NOT MATCHED THEN INSERT (id, name, salary) VALUES (source.id, source.name, source.salary);
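MERGE is not available in every engine. SQLite, for example, does not implement it, so the sketch below swaps in a different technique, SQLite's INSERT ... ON CONFLICT DO UPDATE (an "upsert", assuming SQLite 3.24 or newer), to reproduce the same matched and not-matched outcome as the statement above. It is a stand-in for experimentation, not a MERGE implementation, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees     (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    CREATE TABLE new_employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO employees     VALUES (1, 'Ana', 70000), (2, 'Ben', 60000);
    INSERT INTO new_employees VALUES (2, 'Ben', 65000), (3, 'Cara', 55000);
""")

# Rows whose id already exists are updated (the MATCHED branch);
# rows with a new id are inserted (the NOT MATCHED branch).
# "WHERE true" is needed so SQLite does not read ON CONFLICT as a join condition.
conn.execute("""
    INSERT INTO employees (id, name, salary)
    SELECT id, name, salary FROM new_employees WHERE true
    ON CONFLICT(id) DO UPDATE SET salary = excluded.salary
""")

merged = conn.execute("SELECT id, name, salary FROM employees ORDER BY id").fetchall()
print(merged)  # [(1, 'Ana', 70000.0), (2, 'Ben', 65000.0), (3, 'Cara', 55000.0)]
```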
QUALIFY
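Purpose: Filter rows after window functions are applied. QUALIFY is not part of standard SQL; it is available in dialects such as Snowflake, BigQuery, and Teradata.
Example: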
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees
QUALIFY rank <= 3;
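In engines without QUALIFY, the usual equivalent is to compute the window function in a subquery and filter it with WHERE in the outer query. The sketch below shows that rewrite using Python's sqlite3 module (assuming SQLite 3.25 or newer for window functions); the five employees and the salary_rank alias are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 90000), ('Ben', 80000), ('Cara', 80000),
        ('Dev', 70000), ('Eli', 60000);
""")

# The inner query assigns ranks; the outer WHERE does the job QUALIFY would do.
rows = conn.execute("""
    SELECT name, salary, salary_rank
    FROM (
        SELECT name, salary,
               RANK() OVER (ORDER BY salary DESC) AS salary_rank
        FROM employees
    ) AS ranked
    WHERE salary_rank <= 3
    ORDER BY salary_rank, name
""").fetchall()

print(rows)  # [('Ana', 90000, 1), ('Ben', 80000, 2), ('Cara', 80000, 2)]
```

Dev is excluded even though only three rows come back: the tie between Ben and Cara means RANK jumps from 2 to 4.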
By mastering these SQL queries, keywords, and functions, you’ll have the tools needed to efficiently handle data engineering tasks, build data pipelines, and prepare for SQL-based interviews.