SQL Interview Questions and Answers

Find 100+ SQL interview questions and answers to assess candidates' skills in queries, joins, indexing, database design, and performance optimization.

WeCP Team

Table of Content

Schedule A Demo Assess Candidate's Skills

As SQL remains a cornerstone of data management in 2025, recruiters need to identify professionals who can efficiently design, query, and optimize relational databases. With businesses relying on data-driven decision-making, SQL skills are essential for database administrators, backend developers, and data analysts.

This resource, "100+ SQL Interview Questions and Answers," is designed to help recruiters streamline candidate evaluations. It covers topics ranging from fundamentals to advanced concepts, including joins, indexing, query optimization, stored procedures, transactions, and database security.

Whether hiring junior analysts or experienced database architects, this guide enables you to assess a candidate’s:

Core SQL Knowledge: DDL, DML, and DCL commands, normalization, and constraints.
Advanced Skills: Query optimization, indexing strategies, stored procedures, and triggers.
Real-World Proficiency: Performance tuning, handling large datasets, and working with databases like MySQL, PostgreSQL, SQL Server, and Oracle.

For a streamlined assessment process, consider platforms like WeCP, which allow you to:

✅ Create customized SQL-based assessments with hands-on query tasks.
✅ Include real-world database challenges to assess problem-solving skills.
✅ Conduct remote proctored tests to ensure integrity.
✅ Use AI-driven evaluation for faster and more accurate hiring decisions.

Save time, enhance your recruitment process, and confidently hire SQL experts who can manage and optimize databases effectively from day one.

Easy Level SQL Questions (1-40)

What is SQL?
What are the different types of SQL commands?
What is the difference between WHERE and HAVING?
What is a primary key?
What is a foreign key?
Explain SELECT statement in SQL.
How do you retrieve unique values from a column using SQL?
What is the difference between DELETE and TRUNCATE?
What is the ORDER BY clause?
What is the use of LIMIT or TOP in SQL?
What is a JOIN in SQL?
What is the difference between INNER JOIN and LEFT JOIN?
How do you filter records in SQL?
Explain the concept of NULL values in SQL.
What is the use of the DISTINCT keyword in SQL?
How do you use the LIKE operator in SQL?
What is an index?
How can you add a new column to an existing table?
What is the GROUP BY clause used for in SQL?
How do you update a record in a table?
How do you insert data into a table in SQL?
What is the difference between COUNT and SUM functions?
How do you create a new table in SQL?
What are aggregate functions in SQL?
What is a VIEW in SQL?
What is the difference between VARCHAR and CHAR?
How do you delete records from a table in SQL?
How do you remove a table from a database?
How do you join two tables based on a foreign key?
What are aliases in SQL?
What does the MOD function do in SQL?
How do you retrieve data in ascending order in SQL?
How do you check if a value exists in a table using SQL?
What is the purpose of the IN operator in SQL?
What is the use of the BETWEEN operator in SQL?
How do you concatenate two strings in SQL?
How do you rename a table in SQL?
What is the difference between DROP and DELETE?
What are constraints in SQL?
How do you retrieve the first N records in SQL?

Intermediate Level SQL Questions (1-40)

What are subqueries in SQL? Explain with an example.
What is the difference between UNION and UNION ALL?
Explain the use of CASE in SQL.
How can you optimize a SQL query?
What is the EXISTS clause? How is it different from IN?
How do you use COALESCE in SQL?
What are window functions in SQL?
What is the use of RANK() and DENSE_RANK() functions?
Explain the CROSS JOIN in SQL.
What is the MERGE statement used for?
How do you find the second-highest salary in SQL?
What is the WITH clause used for in SQL?
How do you handle duplicate rows in SQL?
What is a CTE (Common Table Expression)?
Explain the difference between DELETE and TRUNCATE.
How do you calculate the total number of records in a table?
Explain how ROW_NUMBER() works in SQL.
What is the difference between COUNT(*) and COUNT(column)?
How do you convert a column's data type in SQL?
What is the use of the PIVOT clause in SQL?
How do you find records that exist in one table but not in another?
How do you update multiple columns in SQL?
What are transaction control commands in SQL?
Explain the use of COMMIT and ROLLBACK in SQL.
What is normalization in SQL?
What is a self-join?
How do you select rows with a specific condition using HAVING?
What is a correlated subquery?
What is the difference between CHAR and VARCHAR in SQL?
How do you retrieve the current date and time in SQL?
Explain the difference between SINGLE JOIN and MULTIPLE JOIN.
How can you use GROUP_CONCAT() function in SQL?
How do you write a SQL query to find all employees who joined in the last 6 months?
What is the use of TRIGGER in SQL?
Explain ACID properties in SQL.
How do you implement pagination in SQL?
How do you create a temporary table in SQL?
How do you perform a full outer join in SQL?
What is referential integrity in SQL?
How can you avoid SQL injection attacks?

Experience Level SQL Questions (1-40)

Explain indexing and how it can improve query performance.
How do you perform a database migration?
What are the different types of indexes?
What is a clustered index, and how does it differ from a non-clustered index?
How do you optimize slow-running queries in SQL?
What is query execution plan, and how can you analyze it?
Explain partitioning in SQL databases.
What is the purpose of database sharding?
How do you monitor performance in SQL databases?
What are database locks, and how do they affect queries?
How do you handle deadlocks in SQL?
Explain the NOLOCK hint and when it should be used.
What is database replication, and how does it work?
What are materialized views, and how do they differ from regular views?
How do you enforce referential integrity in SQL?
How do you manage large datasets in SQL?
What is the use of the LEAD() and LAG() functions in SQL?
Explain dynamic SQL.
How can you design a database schema for high scalability?
What is the difference between vertical and horizontal scaling in databases?
How do you perform bulk inserts in SQL?
How do you handle error handling in stored procedures?
What is data warehousing, and how is it different from transactional databases?
Explain how to implement indexing strategies for large datasets.
How do you create a stored procedure?
What is the role of database isolation levels?
Explain the difference between snapshot isolation and serializable isolation.
How do you track changes in a SQL database?
How do you write efficient queries for OLAP systems?
What is a surrogate key?
What are the pros and cons of using triggers in SQL databases?
Explain the difference between soft delete and hard delete.
How do you implement concurrency in SQL?
How do you ensure data consistency in distributed databases?
What is a query optimizer, and how does it work?
How do you manage backups and recovery in SQL?
Explain how indexing can negatively affect performance.
How do you handle large-scale data migrations in SQL?
What are recursive queries, and how can they be used?
Explain the concept of eventual consistency in distributed databases.

Answer Easy level questions:

1. What is SQL?

SQL (Structured Query Language) is a standard programming language used to manage and manipulate relational databases. It allows you to query, update, insert, and delete data, as well as define, manipulate, and control access to the database schema.

2. What are the different types of SQL commands?

SQL commands are divided into five categories:

DDL (Data Definition Language): Used to define database structures. Examples: CREATE, ALTER, DROP.
DML (Data Manipulation Language): Used to manipulate data in tables. Examples: SELECT, INSERT, UPDATE, DELETE.
DQL (Data Query Language): Primarily consists of the SELECT statement used to query data.
DCL (Data Control Language): Controls access to the data. Examples: GRANT, REVOKE.
TCL (Transaction Control Language): Manages database transactions. Examples: COMMIT, ROLLBACK.

3. What is the difference between WHERE and HAVING?

WHERE: Filters rows before grouping (used with SELECT, UPDATE, and DELETE statements).
HAVING: Filters rows after grouping (used only with GROUP BY).

Example:

SELECT department, COUNT(*)
FROM employees
WHERE salary > 30000
GROUP BY department
HAVING COUNT(*) > 5;

4. What is a primary key?

A primary key is a column (or a combination of columns) in a table that uniquely identifies each row in that table. It must contain unique, non-null values.

5. What is a foreign key?

A foreign key is a column or a set of columns that establishes a link between two tables. It refers to the primary key in another table, enforcing a relationship between the tables.

6. Explain SELECT statement in SQL.

The SELECT statement is used to retrieve data from one or more tables in a database. It can include filtering, sorting, and grouping data.

Example:

SELECT name, age FROM students;

7. How do you retrieve unique values from a column using SQL?

You can use the DISTINCT keyword to retrieve unique values from a column.

Example:

SELECT DISTINCT city FROM customers;

8. What is the difference between DELETE and TRUNCATE?

DELETE: Removes rows from a table one by one and can be filtered with a WHERE clause. It is a DML command and can be rolled back.
TRUNCATE: Removes all rows from a table without logging individual row deletions. It is faster but cannot be rolled back in most systems. It’s a DDL command.

9. What is the ORDER BY clause?

The ORDER BY clause is used to sort the result set in either ascending (ASC, default) or descending (DESC) order.

Example:

SELECT name, age FROM students ORDER BY age DESC;

10. What is the use of LIMIT or TOP in SQL?

The LIMIT (or TOP in some databases) clause restricts the number of rows returned by the query.

Example (MySQL):

SELECT * FROM employees LIMIT 5;

Example (SQL Server):

SELECT TOP 5 * FROM employees;

11. What is a JOIN in SQL?

A JOIN is used to combine rows from two or more tables based on a related column between them. Common types of joins are:

INNER JOIN: Returns only matching rows.
LEFT JOIN: Returns all rows from the left table and matched rows from the right.
RIGHT JOIN: Returns all rows from the right table and matched rows from the left.
FULL JOIN: Returns rows when there is a match in either table.

12. What is the difference between INNER JOIN and LEFT JOIN?

INNER JOIN: Returns only the rows that have matching values in both tables.
LEFT JOIN: Returns all rows from the left table, along with matching rows from the right table. If no match exists, NULL values are returned for columns from the right table.

13. How do you filter records in SQL?

To filter records, you use the WHERE clause to specify conditions that must be met.

Example:

SELECT * FROM employees WHERE department = 'HR';

14. Explain the concept of NULL values in SQL.

NULL represents missing or unknown data in SQL. It is different from zero or empty string. You cannot use equality (=) to compare NULL values; instead, use IS NULL or IS NOT NULL.

15. What is the use of the DISTINCT keyword in SQL?

The DISTINCT keyword is used to remove duplicates and return only unique values.

Example:

SELECT DISTINCT department FROM employees;

16. How do you use the LIKE operator in SQL?

The LIKE operator is used in a WHERE clause to search for a specified pattern in a column. It often uses % for wildcards (any number of characters) and _ for a single character.

Example:

SELECT * FROM employees WHERE name LIKE 'A%';  -- Starts with 'A'

17. What is an index?

An index improves the speed of data retrieval on a database table. Indexes can be created on one or more columns to optimize query performance, but they may slow down INSERT and UPDATE operations due to the overhead of maintaining the index.

18. How can you add a new column to an existing table?

To add a new column, use the ALTER TABLE statement.

Example:

ALTER TABLE employees ADD birth_date DATE;

19. What is the GROUP BY clause used for in SQL?

The GROUP BY clause is used to group rows that have the same values in specified columns into summary rows, often used with aggregate functions like COUNT, SUM, AVG, etc.

Example:

SELECT department, COUNT(*) FROM employees GROUP BY department;

20. How do you update a record in a table?

Use the UPDATE statement to modify existing records in a table.

Example:

UPDATE employees SET salary = 50000 WHERE id = 101;

21. How do you insert data into a table in SQL?

To insert data into a table, use the INSERT INTO statement.

Example:

INSERT INTO employees (name, department, salary) VALUES ('John', 'HR', 60000);

22. What is the difference between COUNT and SUM functions?

COUNT: Returns the number of rows that match the criteria.
SUM: Returns the total sum of a numeric column.

23. How do you create a new table in SQL?

Use the CREATE TABLE statement to create a new table.

Example:

CREATE TABLE students (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    age INT
);

24. What are aggregate functions in SQL?

Aggregate functions perform calculations on multiple rows and return a single value. Examples: COUNT, SUM, AVG, MAX, MIN.

25. What is a VIEW in SQL?

A VIEW is a virtual table based on the result of a SELECT query. It does not store data but presents data from one or more tables.

Example:

CREATE VIEW employee_view AS
SELECT name, department FROM employees WHERE salary > 40000;

26. What is the difference between VARCHAR and CHAR?

CHAR: Fixed-length string (e.g., CHAR(10) stores exactly 10 characters).
VARCHAR: Variable-length string (e.g., VARCHAR(50) stores up to 50 characters but uses only the required space).

27. How do you delete records from a table in SQL?

Use the DELETE statement to remove specific rows.

Example:

DELETE FROM employees WHERE id = 101;

28. How do you remove a table from a database?

Use the DROP TABLE statement to remove a table and its data.

Example:

DROP TABLE students;

29. How do you join two tables based on a foreign key?

Use an INNER JOIN to join two tables based on the foreign key relationship.

Example:

SELECT employees.name, departments.name
FROM employees
INNER JOIN departments ON employees.department_id = departments.id;

30. What are aliases in SQL?

Aliases are temporary names given to tables or columns to make queries more readable.

Example:

SELECT e.name AS employee_name, d.name AS department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.id;

31. What does the MOD function do in SQL?

The MOD function returns the remainder of a division operation.

Example:

SELECT MOD(10, 3);  -- Output: 1

32. How do you retrieve data in ascending order in SQL?

Use the ORDER BY clause.

Example:

SELECT name FROM students ORDER BY age ASC;

33. How do you check if a value exists in a table using SQL?

Use the EXISTS or IN clause.

Example:

SELECT * FROM employees WHERE EXISTS (SELECT 1 FROM departments WHERE id = employees.department_id);

34. What is the purpose of the IN operator in SQL?

The IN operator is used to filter rows where a column's value matches any value in a list.

Example:

SELECT * FROM employees WHERE department IN ('HR', 'IT', 'Sales');

35. What is the use of the BETWEEN operator in SQL?

The BETWEEN operator filters rows where a column's value lies within a specified range (inclusive).

Example:

SELECT * FROM employees WHERE salary BETWEEN 30000 AND 50000;

36. How do you concatenate two strings in SQL?

Use the CONCAT function (or || in some databases).

Example:

SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM employees;

37. How do you rename a table in SQL?

Use the ALTER TABLE statement.

Example:

ALTER TABLE old_table_name RENAME TO new_table_name;

38. What is the difference between DROP and DELETE?

DELETE: Removes rows from a table.
DROP: Removes the entire table (structure and data).

39. What are constraints in SQL?

Constraints are rules enforced on data columns to maintain data integrity. Examples include:

PRIMARY KEY
FOREIGN KEY
UNIQUE
NOT NULL
CHECK

40. How do you retrieve the first N records in SQL?

Use LIMIT in MySQL or TOP in SQL Server.

Example (MySQL):

SELECT * FROM employees LIMIT 10;

Example (SQL Server):

SELECT TOP 10 * FROM employees;

Intermediate Level ans :

1. What are subqueries in SQL? Explain with an example.

A subquery (or inner query) is a query nested inside another query. Subqueries are used to retrieve data that will be used by the main (outer) query.

Example:

SELECT name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

In this example, the subquery (SELECT AVG(salary) FROM employees) calculates the average salary, and the outer query selects employees whose salary is greater than the average.

2. What is the difference between UNION and UNION ALL?

UNION: Combines the result sets of two or more queries, but removes duplicates.
UNION ALL: Combines the result sets of two or more queries, including duplicates.

Example:

SELECT name FROM employees1
UNION
SELECT name FROM employees2;  -- Duplicates removed

SELECT name FROM employees1
UNION ALL
SELECT name FROM employees2;  -- Duplicates included

3. Explain the use of CASE in SQL.

The CASE statement allows you to perform conditional logic in SQL, similar to IF-THEN-ELSE in other languages.

Example:

SELECT name, 
       salary,
       CASE
           WHEN salary > 50000 THEN 'High'
           WHEN salary BETWEEN 30000 AND 50000 THEN 'Medium'
           ELSE 'Low'
       END AS salary_category
FROM employees;

In this example, the CASE statement categorizes the salary into 'High', 'Medium', and 'Low'.

4. How can you optimize a SQL query?

Some ways to optimize SQL queries include:

Indexing: Add indexes to columns used in WHERE, JOIN, and ORDER BY clauses.
Avoiding SELECT *: Select only the required columns instead of using SELECT *.
Avoiding subqueries: Replace subqueries with JOINs where possible.
Use EXISTS instead of IN: In large datasets, EXISTS performs better than IN.
Query execution plan: Analyze the execution plan using EXPLAIN or ANALYZE to identify bottlenecks.

5. What is the EXISTS clause? How is it different from IN?

The EXISTS clause is used to check if a subquery returns any rows. It returns TRUE if the subquery returns at least one row. IN checks if a value exists in a list or result set.

Difference:

EXISTS: Stops checking once a match is found, making it faster in larger datasets.
IN: Compares all values, which can be slower with large lists.

Example:

SELECT * FROM employees WHERE EXISTS (SELECT 1 FROM departments WHERE departments.id = employees.department_id);

6. How do you use COALESCE in SQL?

COALESCE returns the first non-NULL value from a list of expressions. It's useful for handling NULL values in queries.

Example:

SELECT name, COALESCE(phone, 'No Phone') AS phone
FROM employees;

In this query, if the phone column has a NULL value, it will return 'No Phone' instead.

7. What are window functions in SQL?

Window functions perform calculations across a set of rows related to the current row without collapsing them into a single result. Examples include ROW_NUMBER(), RANK(), LEAD(), and LAG().

Example:

SELECT name, salary, 
       RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

Here, the RANK() function assigns a rank to each employee based on their salary, without collapsing the result set.

8. What is the use of RANK() and DENSE_RANK() functions?

RANK(): Assigns a unique rank to rows, but leaves gaps if there are ties.
DENSE_RANK(): Assigns consecutive ranks to rows, without gaps, even if there are ties.

Example:

SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC) AS rank,
       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

9. Explain the CROSS JOIN in SQL.

A CROSS JOIN returns the Cartesian product of two tables, i.e., it pairs each row from the first table with every row from the second table. It's rarely used directly because it results in large datasets.

Example:

SELECT * FROM employees CROSS JOIN departments;

This query returns all possible combinations of rows from employees and departments.

10. What is the MERGE statement used for?

The MERGE statement allows you to perform INSERT, UPDATE, or DELETE operations in a single statement based on conditions, typically used for upserting data.

Example:

MERGE INTO employees AS target
USING new_employees AS source
ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET target.salary = source.salary
WHEN NOT MATCHED THEN
    INSERT (id, name, salary) VALUES (source.id, source.name, source.salary);

11. How do you find the second-highest salary in SQL?

You can use the LIMIT or ROW_NUMBER() functions to find the second-highest salary.

Example (using LIMIT):

SELECT salary 
FROM employees 
ORDER BY salary DESC 
LIMIT 1 OFFSET 1;  -- Skip the highest salary and get the second-highest

Example (using ROW_NUMBER()):

WITH RankedSalaries AS (
    SELECT salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS rank
    FROM employees
)
SELECT salary FROM RankedSalaries WHERE rank = 2;

12. What is the WITH clause used for in SQL?

The WITH clause (also known as Common Table Expressions, or CTEs) allows you to define temporary result sets that can be referenced within the main query.

Example:

WITH AvgSalary AS (
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
)
SELECT * FROM employees e
JOIN AvgSalary a ON e.department = a.department
WHERE e.salary > a.avg_salary;

13. How do you handle duplicate rows in SQL?

To remove duplicates, use the DISTINCT keyword or identify and delete them using ROW_NUMBER() or other ranking/window functions.

Example (using DISTINCT):

SELECT DISTINCT name, department FROM employees;

Example (deleting duplicates using ROW_NUMBER()):

WITH CTE AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY name, department ORDER BY id) AS row_num
    FROM employees
)
DELETE FROM CTE WHERE row_num > 1;

14. What is a CTE (Common Table Expression)?

A CTE is a temporary result set that is defined within a WITH clause and can be referenced later in the query. It makes complex queries more readable and reusable.

Example:

WITH EmployeeSalary AS (
    SELECT name, salary FROM employees
)
SELECT * FROM EmployeeSalary WHERE salary > 50000;

15. Explain the difference between DELETE and TRUNCATE.

DELETE: Removes rows from a table based on a WHERE clause. It logs each row's deletion and can be rolled back (transaction-safe).
TRUNCATE: Quickly removes all rows from a table by deallocating the data pages. It cannot be rolled back in many databases and resets the table’s identity column.

16. How do you calculate the total number of records in a table?

You can use the COUNT(*) function to calculate the total number of records.

Example:

SELECT COUNT(*) FROM employees;

17. Explain how ROW_NUMBER() works in SQL.

The ROW_NUMBER() function assigns a unique sequential integer to rows within a result set. It is often used for pagination or identifying duplicates.

Example:

SELECT name, salary,
       ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;

**18. What is the difference between COUNT(*) and COUNT(column)?**

COUNT(*): Counts all rows in the table, including rows with NULL values.
COUNT(column): Counts non-NULL values in the specified column.

19. How do you convert a column's data type in SQL?

Use the CAST() or CONVERT() functions to convert a column's data type.

Example:

SELECT name, CAST(salary AS VARCHAR) FROM employees;

SELECT name, CONVERT(VARCHAR, salary) FROM employees;

20. What is the HAVING clause and how is it different from WHERE?

WHERE: Filters rows before aggregation.
HAVING: Filters rows after aggregation, usually used with GROUP BY.

Example:

SELECT department, COUNT(*)
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

21. How do you add a new column to an existing table?

Use the ALTER TABLE statement.

Example:

ALTER TABLE employees ADD birth_date DATE;

22. What is the use of GROUP BY in SQL?

The GROUP BY clause groups rows that have the same values in specified columns into aggregated data. It is commonly used with aggregate functions.

Example:

SELECT department, COUNT(*) FROM employees GROUP BY department;

23. What is a self-join?

A self-join is a join where a table is joined with itself. It is useful when comparing rows within the same table.

Example:

SELECT e1.name, e2.name
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id;

24. How do you delete duplicate rows in SQL?

Use ROW_NUMBER() to identify duplicates and then delete them.

Example:

WITH CTE AS (
    SELECT name, department, ROW_NUMBER() OVER (PARTITION BY name, department ORDER BY id) AS row_num
    FROM employees
)
DELETE FROM CTE WHERE row_num > 1;

25. What are scalar functions in SQL?

Scalar functions operate on a single value and return a single value. Examples include LEN(), UPPER(), LOWER(), ROUND().

Example:

SELECT UPPER(name) FROM employees;

26. What is a correlated subquery?

A correlated subquery is a subquery that depends on the outer query for its values. It is evaluated once for each row processed by the outer query.

Example:

SELECT name, salary
FROM employees e
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);

27. What is a materialized view?

A materialized view stores the result of a query physically, unlike a regular view, which only stores the query. This improves performance for complex queries but requires periodic refreshes.

28. How do you update records in a table based on another table’s values?

Use an UPDATE with a JOIN.

Example:

UPDATE e
SET e.salary = n.salary
FROM employees e
JOIN new_salaries n ON e.id = n.id;

29. What is a PIVOT table in SQL?

A PIVOT table converts rows into columns. It’s often used for summarizing data.

Example:

SELECT *
FROM (SELECT department, salary FROM employees) AS SourceTable
PIVOT (SUM(salary) FOR department IN ([HR], [IT], [Sales])) AS PivotTable;

30. What is the difference between ANY and ALL operators?

ANY: Returns TRUE if any of the values in the subquery meet the condition.
ALL: Returns TRUE only if all values meet the condition.

Example (ANY):

SELECT name FROM employees WHERE salary > ANY (SELECT salary FROM managers);

Example (ALL):

SELECT name FROM employees WHERE salary > ALL (SELECT salary FROM managers);

31. How do you use the LEAD() function in SQL?

The LEAD() function provides access to a row at a specific physical offset following the current row in the result set.

Example:

SELECT name, salary, LEAD(salary, 1) OVER (ORDER BY salary DESC) AS next_salary
FROM employees;

32. What is the purpose of the NULLIF() function in SQL?

NULLIF() returns NULL if the two expressions are equal, otherwise it returns the first expression.

Example:

SELECT NULLIF(salary, 0) FROM employees;

33. How do you add an index to a table?

Use the CREATE INDEX statement to add an index.

Example:

CREATE INDEX idx_salary ON employees(salary);

34. What is the ROLLUP operator in SQL?

The ROLLUP operator creates subtotals and a grand total across multiple levels of aggregation.

Example:

SELECT department, COUNT(*)
FROM employees
GROUP BY ROLLUP(department);

35. How do you calculate the cumulative sum in SQL?

Use the SUM() window function with OVER().

Example:

SELECT name, salary, 
       SUM(salary) OVER (ORDER BY salary) AS cumulative_salary
FROM employees;

36. What is a FULL OUTER JOIN?

A FULL OUTER JOIN returns all rows when there is a match in either table. If there is no match, NULL values are returned.

Example:

SELECT * FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.id;

37. How do you perform pagination in SQL?

Use LIMIT and OFFSET for MySQL, or ROW_NUMBER() for other databases.

Example (MySQL):

SELECT * FROM employees LIMIT 10 OFFSET 20;  -- Page 3, with 10 rows per page

Example (SQL Server):

WITH EmployeeCTE AS (
    SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
    FROM employees
)
SELECT * FROM EmployeeCTE WHERE row_num BETWEEN 21 AND 30;  -- Page 3, with 10 rows per page

38. What is the difference between CHAR and VARCHAR?

CHAR: Fixed-length character data type. Always reserves the specified space, even if the actual value is shorter.
VARCHAR: Variable-length character data type. Only uses as much space as required for the actual value.

39. How do you remove a column from a table in SQL?

Use the ALTER TABLE statement with the DROP COLUMN clause.

Example:

ALTER TABLE employees DROP COLUMN birth_date;

40. How do you find duplicate records in SQL?

Use GROUP BY with HAVING to find duplicate records.

Example:

SELECT name, COUNT(*)
FROM employees
GROUP BY name
HAVING COUNT(*) > 1;

Experience Level ans:

1. Explain database normalization and its different forms.

Database normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, more manageable ones and defining relationships between them to ensure the database is structured efficiently.

The main forms of normalization are:

First Normal Form (1NF):some text
- Ensures that all values in a table are atomic (indivisible).
- Each column contains only one value, and there are no repeating groups.

Example:

-- Non-1NF table
| ID  | Name  | Phone Numbers         |
|-----|-------|-----------------------|
| 1   | John  | 123-456, 789-123      |

-- 1NF table (atomic values)
| ID  | Name  | Phone Number          |
|-----|-------|-----------------------|
| 1   | John  | 123-456               |
| 1   | John  | 789-123               |

Second Normal Form (2NF):some text
- Satisfies 1NF and ensures that every non-key attribute is fully dependent on the primary key.
- Removes partial dependency by eliminating any dependency of a non-key attribute on part of a composite key.

Example:

-- Non-2NF table
| StudentID | CourseID | CourseName | Grade |
|-----------|----------|------------|-------|

-- 2NF tables (removing partial dependency)
| CourseID | CourseName |
|----------|------------|

| StudentID | CourseID | Grade |

Third Normal Form (3NF):some text
- Satisfies 2NF and ensures that all non-key attributes are dependent only on the primary key.
- Eliminates transitive dependencies (dependencies between non-key attributes).

Example:

-- Non-3NF table
| EmployeeID | DepartmentID | DepartmentName | EmployeeName |

-- 3NF tables (eliminating transitive dependency)
| DepartmentID | DepartmentName |
|--------------|----------------|

| EmployeeID | DepartmentID | EmployeeName |

Boyce-Codd Normal Form (BCNF):some text
- A stricter version of 3NF where every determinant must be a candidate key.
- Used to handle certain anomalies not addressed by 3NF.

Normalization helps to avoid data anomalies such as update, delete, and insert anomalies by organizing data logically.

‍2. What are indexes in SQL? Explain the types of indexes.

Indexes are special lookup tables that the database management system uses to speed up data retrieval. They are created on columns to reduce the time required to search for rows. An index is like a book’s index, which helps to quickly locate specific data without having to search through every page.There are two main types of indexes:

Clustered Index:
- Alters the physical order of the data in the table to match the index order.
- A table can have only one clustered index because data rows can be stored in only one order.
- The primary key usually creates a clustered index by default.

Example:

CREATE CLUSTERED INDEX idx_employee_id ON employees(id);

Non-Clustered Index:some text
- Does not change the physical order of the data. Instead, it creates a separate object within the table that holds the index and points back to the original data.
- A table can have multiple non-clustered indexes.

Example:

CREATE NONCLUSTERED INDEX idx_employee_name ON employees(name);

Other types of indexes include:

Unique Index: Ensures all values in the indexed column are unique.
Composite Index: An index created on two or more columns.
Full-text Index: Allows text-based searching for complex queries in textual data.

Indexes improve query performance, but they come at the cost of slower data modifications (like INSERT, UPDATE, and DELETE), as the index needs to be updated.

‍3. Explain the ACID properties in SQL.

ACID properties (Atomicity, Consistency, Isolation, Durability) are a set of properties that ensure reliable processing of database transactions. They maintain the integrity of a database, even in the event of errors, power failures, or crashes.

Atomicity:some text
- Ensures that each transaction is treated as a single unit, which either succeeds completely or fails completely. If one part of the transaction fails, the entire transaction fails, and the database is left unchanged.

Example:

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;  -- If either UPDATE fails, the entire transaction is rolled back.

Consistency:some text
- Ensures that a transaction brings the database from one valid state to another, maintaining all predefined rules such as constraints, triggers, and cascades.
- After the transaction completes, all integrity constraints are preserved.

Example:

-- If a transaction violates a constraint (e.g., unique or foreign key), the transaction is rolled back.

Isolation:some text
- Ensures that the concurrent execution of transactions leads to the same state as if the transactions were executed sequentially. It prevents one transaction from interfering with another.
- The isolation levels (e.g., READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE) control the visibility of data between concurrent transactions.

Example:

-- Two transactions cannot read the same data until the first transaction is complete, depending on the isolation level.

Durability:some text
- Ensures that once a transaction is committed, its changes are permanent, even in the event of a system crash. The committed data will not be lost.

Example:

-- After COMMIT, the changes made by the transaction are saved permanently in the database.

The ACID properties guarantee the reliability and consistency of databases, especially in multi-user environments.

‍4. What are SQL transactions? Explain COMMIT, ROLLBACK, and SAVEPOINT.

‍A transaction in SQL is a sequence of one or more SQL operations treated as a single unit of work. A transaction ensures data consistency and integrity by following ACID properties. COMMIT: Finalizes a transaction by making all the changes permanent in the database.

Example:

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;  -- Makes all changes permanent.

ROLLBACK: Undoes all the changes made in the current transaction, reverting the database to its previous state before the transaction started.

Example:

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
ROLLBACK;  -- Reverts all changes made during the transaction.

SAVEPOINT: Creates intermediate points within a transaction where you can roll back to a specific savepoint instead of the entire transaction.

Example:

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
SAVEPOINT before_transfer;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
ROLLBACK TO before_transfer;  -- Rolls back only to the savepoint.

5. What is a deadlock in SQL, and how can it be resolved?

‍A deadlock occurs when two or more transactions block each other by holding locks on resources that the other transactions need. Each transaction is waiting for the other to release the lock, resulting in a situation where neither transaction can proceed.Example of a deadlock:

Transaction 1 locks Table A and tries to access Table B.
Transaction 2 locks Table B and tries to access Table A. Both transactions are now stuck in a deadlock.

Resolution strategies: Deadlock detection: The database detects the deadlock automatically and aborts one of the transactions, rolling back its changes. The other transaction can then proceed.
Example:

-- SQL Server automatically detects deadlocks and resolves them by terminating one of the processes.

Timeout: The database aborts a transaction if it is waiting too long for a lock.

Example:

SET LOCK_TIMEOUT 10000;  -- Sets the lock timeout to 10 seconds.

Locking strategies:some text
- Avoiding exclusive locks: Use appropriate transaction isolation levels to avoid unnecessary locks.
- Lock ordering: Ensure that transactions acquire locks in the same order to prevent circular waits.

6. Explain the difference between TRUNCATE, DELETE, and DROP.

DELETE: Removes rows from a table based on a condition. It can be used with a WHERE clause to delete specific records. It logs individual row deletions and can be rolled back if the transaction is not committed.

Example:

DELETE FROM employees WHERE department = 'HR';

TRUNCATE: Removes all rows from a table but does not log individual row deletions. It is faster than DELETE because it does not generate as much logging and cannot be rolled back.

Example:

TRUNCATE TABLE employees;

DROP: Removes a table or database entirely, along with all its data and structure. It cannot be rolled back.

Example:

DROP TABLE employees;

Key differences:

DELETE allows for row-by-row removal and can be rolled back.
TRUNCATE removes all rows and cannot be rolled back, but retains the table structure.
DROP completely removes the table from the database.

7. What is partitioning in SQL, and what are its types?

Partitioning is the process of dividing a large table into smaller, more manageable pieces, called partitions. It helps improve query performance by limiting the amount of data that needs to be scanned and simplifies data management for large datasets.Types of partitioning:

Range Partitioning:some text
- Divides the data into partitions based on a range of values in a column (e.g., dates, numbers).

Example:

CREATE TABLE sales (
    id INT,
    sale_date DATE,
    amount DECIMAL
)
PARTITION BY RANGE (sale_date) (
    PARTITION p1 VALUES LESS THAN ('2022-01-01'),
    PARTITION p2 VALUES LESS THAN ('2023-01-01')
);

List Partitioning:
- Divides data based on a list of discrete values.

Example:

CREATE TABLE employees (
    id INT,
    department VARCHAR(50)
)
PARTITION BY LIST (department) (
    PARTITION p1 VALUES IN ('HR', 'Finance'),
    PARTITION p2 VALUES IN ('IT', 'Sales')
);

Hash Partitioning:some text
- Distributes data across partitions based on the result of a hash function.

Example:

CREATE TABLE users (
    id INT,
    name VARCHAR(100)
)
PARTITION BY HASH (id);

Composite Partitioning:some text
- Combines two or more types of partitioning (e.g., range and list).

Partitioning helps in optimizing query performance, especially for large datasets, by allowing the database to search only in specific partitions instead of the entire table.

‍8. What are SQL triggers? Explain their types.

‍A trigger is a special type of stored procedure in SQL that automatically executes (or “fires”) when certain events occur in the database, such as INSERT, UPDATE, or DELETE operations on a table.Types of triggers:

Before Triggers:some text
- Execute before the actual operation (INSERT, UPDATE, DELETE) takes place.

Example:

CREATE TRIGGER before_insert_employee
BEFORE INSERT ON employees
FOR EACH ROW
SET NEW.salary = IF(NEW.salary < 1000, 1000, NEW.salary);

After Triggers:
- Execute after the actual operation is completed.

Example:

CREATE TRIGGER after_update_salary
AFTER UPDATE ON employees
FOR EACH ROW
INSERT INTO audit_log (employee_id, old_salary, new_salary)
VALUES (OLD.id, OLD.salary, NEW.salary);

Instead of Triggers (used in views):some text
- Execute instead of the actual operation, useful for updating or inserting into views.

Example:

CREATE TRIGGER instead_of_insert
INSTEAD OF INSERT ON employee_view
FOR EACH ROW
INSERT INTO employees (name, department)
VALUES (NEW.name, NEW.department);

Triggers are useful for enforcing business rules, logging changes, and maintaining data integrity, but they can impact performance if not used carefully.

‍9. What is query optimization, and why is it important?

Query optimization is the process of improving the performance of SQL queries to ensure they execute in the most efficient way possible. The goal is to minimize resource usage (such as CPU, memory, and I/O) and reduce the execution time of queries.Reasons for query optimization:

Improves performance: Optimized queries run faster, making the application more responsive.
Reduces resource consumption: Efficient queries use fewer system resources (memory, CPU), reducing the load on the database.
Handles large datasets: Optimization is critical for large databases where inefficient queries can cause significant slowdowns.

Common techniques for query optimization:

Indexes: Create indexes on frequently queried columns to speed up searches.
Avoid SELECT *: Select only the columns needed, as SELECT * retrieves unnecessary data.
Use appropriate joins: Ensure you’re using the correct type of join (INNER, LEFT, etc.) based on the data requirements.
Limit results: Use LIMIT or TOP to restrict the number of rows returned.
Use query plans: Analyze query plans to see how the database executes the query and identify bottlenecks.

Query optimization is crucial for maintaining fast, scalable, and efficient database systems.

‍10. Explain SQL Server isolation levels.

‍Isolation levels in SQL Server determine how transaction integrity is maintained and how locks are managed between concurrent transactions. They control the visibility of uncommitted changes made by other transactions.The five isolation levels are:

Read Uncommitted:some text
- No locks are placed on data being read, allowing dirty reads (i.e., reading uncommitted data from other transactions).

Example:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

Read Committed:some text
- Default level. Prevents dirty reads by ensuring that only committed data can be read. Shared locks are placed on data being read.

Example:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

Repeatable Read:some text
- Prevents dirty reads and ensures that if a row is read multiple times in a transaction, it will not change (no non-repeatable reads), but phantom reads are still possible.

Example:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

Serializable:some text
- The strictest level, preventing dirty reads, non-repeatable reads, and phantom reads. Ensures complete isolation by locking entire ranges of data, making it the most resource-intensive.

Example:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Snapshot:some text
- Provides a snapshot of the data as it existed at the start of the transaction, avoiding locks but still preventing dirty reads.

Example:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

11. Explain the difference between INNER JOIN and OUTER JOIN in SQL.

INNER JOIN:some text
- Retrieves only the records that have matching values in both tables.
- Rows without matching values in either table are excluded from the result.

Example:

SELECT employees.name, departments.name
FROM employees
INNER JOIN departments
ON employees.department_id = departments.id;

In this example, only employees who have a matching department (i.e., department_id in employees matches id in departments) will be included in the result.
OUTER JOIN:some text
- Retrieves records that have matching values as well as unmatched records from one or both tables, depending on the type of OUTER JOIN.
- Types of OUTER JOIN:some text
  1. LEFT OUTER JOIN: Returns all rows from the left table, even if there are no matches in the right table.
  2. RIGHT OUTER JOIN: Returns all rows from the right table, even if there are no matches in the left table.
  3. FULL OUTER JOIN: Returns rows when there is a match in one of the tables. If there is no match, it returns NULL for non-matching rows.

Example of LEFT OUTER JOIN:

SELECT employees.name, departments.name
FROM employees
LEFT OUTER JOIN departments
ON employees.department_id = departments.id;

In this example, all employees are listed even if they do not belong to any department (in which case the department name will be NULL).

12. What is a correlated subquery? How is it different from a regular subquery?

‍A correlated subquery is a subquery that references columns from the outer query. It is executed once for every row selected by the outer query.

Example:

SELECT e.name, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department_id = e.department_id
);

In this example, for each employee, the subquery calculates the average salary for that employee's department. This is a correlated subquery because the subquery depends on the outer query's department_id.
Regular subquery:some text
- A regular subquery is executed independently of the outer query. It is executed once, and its result is passed to the outer query.

Example:

SELECT name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

Here, the subquery calculates the average salary once, and the outer query compares it to the salary of each employee.

Key differences:

A correlated subquery depends on the outer query for its value and is evaluated multiple times.
A regular subquery is independent and is evaluated once.

13. What is a SQL MERGE statement, and when would you use it?

‍The MERGE statement allows you to perform INSERT, UPDATE, or DELETE operations in a single SQL statement based on the matching condition between two tables. It’s often used in data warehousing and data integration scenarios where data from one table (source) needs to be merged into another table (target).

Syntax:

MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET target.column = source.column
WHEN NOT MATCHED THEN
    INSERT (column1, column2)
    VALUES (source.column1, source.column2);

When to use it:

To synchronize two tables, for example, updating existing rows and inserting new ones from another table.
When you need to update existing records if they exist, and insert new ones if they don’t.

Example:

MERGE INTO employees AS target
USING new_employees AS source
ON target.employee_id = source.employee_id
WHEN MATCHED THEN
    UPDATE SET target.salary = source.salary
WHEN NOT MATCHED THEN
    INSERT (employee_id, name, salary)
    VALUES (source.employee_id, source.name, source.salary);

14. Explain SQL Server's WITH (NOLOCK) hint. What are its advantages and disadvantages?

The WITH (NOLOCK) hint allows a query to read uncommitted data, effectively bypassing locks on the table. This can improve query performance by reducing wait times for locks, but it comes with certain risks.

Advantages:

Improved performance: Since NOLOCK does not require waiting for locks, the query can execute faster.
No blocking: The query can run without being blocked by other transactions holding locks on the table.

Disadvantages:

Dirty reads: The query can read uncommitted (and potentially incorrect) data that may change later if a transaction is rolled back.
Inconsistent results: You may get partially updated data or rows that disappear mid-query.
Phantom reads: Rows that match the query criteria at the start may not exist when the transaction is committed.

Example:

SELECT * FROM employees WITH (NOLOCK);

This query will return data from the employees table, but it might include uncommitted changes that could be rolled back later.

15. How do you handle performance tuning for large SQL queries?

Performance tuning involves optimizing SQL queries to ensure they run efficiently, especially for large datasets. Some common techniques include:

Use Indexes: Ensure that appropriate indexes are created on columns used in WHERE, JOIN, and ORDER BY clauses.
Avoid SELECT *: Query only the necessary columns instead of selecting all columns, reducing the amount of data retrieved.
Use Proper Joins: Ensure the correct type of JOIN is used (e.g., INNER JOIN instead of OUTER JOIN if possible).
Use Subqueries Wisely: Replace correlated subqueries with JOINs where possible.
Use Query Plans: Analyze the query execution plan to identify bottlenecks such as full table scans, inefficient joins, or missing indexes.
Partitioning: For large tables, use partitioning to reduce the number of rows scanned by the query.
Caching: If data doesn’t change frequently, consider caching the results to avoid repeated query execution.

Example of using indexes:

CREATE INDEX idx_employee_department ON employees(department_id);

16. Explain how to handle concurrency issues in SQL.

Concurrency issues arise when multiple transactions access the same data simultaneously, leading to conflicts such as dirty reads, non-repeatable reads, or phantom reads. These issues can be mitigated using transaction isolation levels and locking mechanisms.

Transaction Isolation Levels:some text
- READ UNCOMMITTED: Allows dirty reads but has minimal locking.
- READ COMMITTED: Prevents dirty reads by ensuring only committed data is read.
- REPEATABLE READ: Prevents dirty and non-repeatable reads but can still allow phantom reads.
- SERIALIZABLE: The strictest level, preventing all concurrency issues, but can lead to higher resource consumption due to locks.
Locking:some text
- SQL uses locks (e.g., shared locks, exclusive locks) to manage concurrent access to data.
- Lock granularity can be at the row, page, or table level.

Example of setting the isolation level:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;

Using appropriate isolation levels based on your use case can help balance between data consistency and system performance.

17. What are SQL window functions? Provide examples.

Window functions perform calculations across a set of table rows related to the current row without collapsing them into a single result like aggregate functions. They are used for tasks like running totals, moving averages, and ranking.

Key types of window functions:

ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition.

Example:

SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS rank
FROM employees;

RANK(): Similar to ROW_NUMBER(), but rows with equal values get the same rank, and gaps appear in the sequence.

Example:

SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;

SUM(): Returns the running total of a column over a partition.

Example:

SELECT name, salary, SUM(salary) OVER (ORDER BY salary) AS running_total
FROM employees;

LEAD()/LAG(): Retrieves the value of a column from a subsequent or preceding row.

Example:

SELECT name, salary, LEAD(salary, 1) OVER (ORDER BY salary) AS next_salary
FROM employees;

18. How do you design a database schema for high availability?

‍Designing a database schema for high availability ensures that the database remains accessible and operational even during hardware or software failures. Common strategies include:

Replication: Set up data replication across multiple servers or locations to ensure data redundancy. Use master-slave or master-master replication depending on the write-read ratio.
Partitioning: Partition large tables to distribute data across multiple servers and optimize read/write performance.
Failover Clustering: Use database clusters where multiple servers act as backups to each other. In the event of a failure, another node can take over with minimal downtime.
Sharding: Split the database horizontally, distributing different subsets of data across different servers.
Backup and Restore: Regularly back up data and implement automated restore mechanisms to recover quickly in case of failure.
Load Balancing: Distribute read/write requests across multiple database nodes to avoid overloading a single node.

19. What is database normalization? Explain the different normal forms.

Normalization is the process of organizing data to reduce redundancy and improve data integrity. It involves breaking down tables into smaller tables and defining relationships between them.Normal Forms:

First Normal Form (1NF):some text
- Ensures that the table has atomic (indivisible) values and no repeating groups.

Example:

| OrderID | Product |
|---------|---------|
| 1       | A       |
| 1       | B       |

Second Normal Form (2NF):some text
- Builds on 1NF and ensures that all non-key attributes are fully dependent on the primary key.

Example:

| OrderID | ProductID | Quantity |
|---------|-----------|----------|

Third Normal Form (3NF):some text
- Builds on 2NF and ensures that no transitive dependencies exist (i.e., non-key attributes should not depend on other non-key attributes).

Example:

| CustomerID | CustomerName | OrderID |

20. What is the difference between GROUP BY and ORDER BY in SQL?

GROUP BY: Groups rows that have the same values in specified columns into summary rows (e.g., total sales per customer). It is used with aggregate functions like COUNT(), SUM(), AVG(), etc.

Example:

SELECT department, COUNT(*) 
FROM employees
GROUP BY department;

ORDER BY: Sorts the result set based on one or more columns. It does not perform any aggregation.

Example:

SELECT name, salary 
FROM employees
ORDER BY salary DESC;

21. Explain the CTE (Common Table Expressions) in SQL. How is it different from a subquery?

A Common Table Expression (CTE) is a temporary result set defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement. CTEs make complex queries easier to read and manage by breaking them into simpler parts.

Syntax:

WITH cte_name AS (
    SELECT column1, column2
    FROM table_name
    WHERE condition
)
SELECT *
FROM cte_name;

Example:

WITH EmployeeCTE AS (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT department_id, avg_salary
FROM EmployeeCTE;

Key differences between CTE and subquery:

Readability: CTEs improve readability for complex queries, while subqueries can become difficult to manage if nested.
Reusability: CTEs can be referenced multiple times within the main query, whereas subqueries are usually evaluated each time they are referenced.
Recursion: CTEs support recursive queries, which subqueries do not.

22. What is a recursive CTE? Provide an example.

A recursive CTE is a CTE that references itself. It is used to work with hierarchical data, such as organizational structures or tree structures.

Syntax:

WITH RECURSIVE cte_name AS (
    -- Anchor member: the base query
    SELECT column1, column2
    FROM table_name
    WHERE condition
  
    UNION ALL
  
    -- Recursive member: references the CTE itself
    SELECT column1, column2
    FROM table_name
    INNER JOIN cte_name
    ON table_name.column = cte_name.column
)
SELECT *
FROM cte_name;

Example: Retrieving an employee hierarchy (managers and subordinates):

WITH RECURSIVE EmployeeHierarchy AS (
    SELECT employee_id, name, manager_id
    FROM employees
    WHERE manager_id IS NULL  -- Anchor: Top-level manager
  
    UNION ALL
  
    SELECT e.employee_id, e.name, e.manager_id
    FROM employees e
    INNER JOIN EmployeeHierarchy eh
    ON e.manager_id = eh.employee_id  -- Recursion: Subordinates
)
SELECT *
FROM EmployeeHierarchy;

This recursive CTE will list all employees and their reporting hierarchy.

23. What are materialized views, and how are they different from regular views?

A materialized view is a physical copy of a result set, stored in a database as a separate object. Unlike regular views, which are virtual and always reflect the latest data from their base tables, materialized views store data persistently and must be manually refreshed.

Key differences:

Data Storage:some text
- Regular views do not store data; they only store the query and retrieve data dynamically when accessed.
- Materialized views store data physically, meaning the data is stored and can be queried without accessing the base tables.
Performance:some text
- Regular views query the base tables each time, which can be slow for complex queries.
- Materialized views improve performance by avoiding repeated computations, but the data may become stale if not refreshed.
Refresh:some text
- Regular views do not require refreshing.
- Materialized views require refreshing to stay up to date, either manually or automatically on a schedule.

Example of creating a materialized view:

CREATE MATERIALIZED VIEW sales_summary AS
SELECT product_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY product_id;

24. Explain the concept of database indexing. What are the different types of indexes?

Indexing is a technique used to improve the speed of data retrieval operations on a database. An index creates a data structure (usually a B-tree or hash) that provides faster access to the rows in a table.

Types of indexes:

Clustered Index:some text
- Sorts and stores the actual data rows in the table based on the indexed column.
- There can only be one clustered index per table, as the data can be stored in only one order.

Example:

CREATE CLUSTERED INDEX idx_employee_id ON employees(employee_id);

Non-Clustered Index:some text
- Creates a separate structure to store pointers to the actual data rows. The table data itself is not stored in sorted order based on this index.
- A table can have multiple non-clustered indexes.

Example:

CREATE NONCLUSTERED INDEX idx_employee_name ON employees(name);

Unique Index:some text
- Ensures that the values in the indexed column(s) are unique, preventing duplicate values.

Example:

CREATE UNIQUE INDEX idx_unique_email ON employees(email);

Composite Index:some text
- An index on multiple columns.

Example:

CREATE INDEX idx_employee_department ON employees(department_id, salary);

Indexes improve query performance but can slow down INSERT, UPDATE, and DELETE operations because the index must also be updated.

‍25. What is database sharding ? When would you use it?

Database sharding is the process of splitting a large database into smaller, more manageable pieces (called shards), which are distributed across multiple servers. Each shard contains a subset of the total data, and together, they represent the entire dataset.When to use it:

When the database grows so large that a single server can no longer handle the load or storage requirements.
When you need to scale horizontally by distributing the load across multiple servers to improve performance.

Example of sharding by user ID:

Users with IDs between 1 and 1000 are stored on Server 1.
Users with IDs between 1001 and 2000 are stored on Server 2, etc.

Advantages:

Improved performance and scalability.
Allows distributed query execution.

Disadvantages:

Complexity in managing and maintaining shards.
Cross-shard queries can be slow.

26. How does SQL Server manage transaction logs, and why are they important?

‍In SQL Server, the transaction log records all changes made to the database. It ensures that each transaction is either fully completed (committed) or fully undone (rolled back) in case of failure, maintaining the integrity of the database.Key roles of the transaction log:

Recovery: The log allows SQL Server to recover from a crash by replaying committed transactions and rolling back uncommitted ones.
Backup and Restore: Transaction logs are used in point-in-time recovery scenarios.
Rollback: If a transaction is aborted, the log enables SQL Server to undo the changes.

Example of a full backup with transaction logs:

BACKUP DATABASE db_name TO DISK = 'db_backup.bak';
BACKUP LOG db_name TO DISK = 'db_log.trn';

The transaction log is crucial for data consistency and recovery but must be managed carefully to avoid growing too large.

27. Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK() in SQL.

All three are window functions that assign a rank to rows within a partition, but they behave differently when encountering ties (rows with the same value).

ROW_NUMBER():some text
- Assigns a unique sequential number to each row, even if there are ties.

Example:

SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;

RANK():some text
- Assigns the same rank to rows with ties, but gaps appear in the ranking sequence.

Example:

SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;

DENSE_RANK():some text
- Assigns the same rank to rows with ties, but no gaps appear in the ranking sequence.

Example:

SELECT name, salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

Differences:

ROW_NUMBER(): No ties, unique numbers.
RANK(): Ties have the same rank, but gaps occur in the sequence.
DENSE_RANK(): Ties have the same rank, and there are no gaps.

28. What is the difference between replication and mirroring in SQL?

Replication and mirroring are both techniques used to enhance database availability and reliability, but they serve different purposes and work in different ways. Replication

Purpose: Replication is primarily used for data distribution, load balancing, and ensuring data availability across multiple servers or locations.
How It Works: In replication, data from a primary database (or publisher) is copied to one or more secondary databases (or subscribers). This can be done asynchronously or synchronously, depending on the replication type.
Types of Replication:
‍Transactional Replication: Changes made to the primary database are captured and replicated to the secondary databases in real-time or near real-time.

Snapshot Replication: A snapshot of the primary database is taken at specific intervals and sent to secondary databases. This does not track changes in real-time.
Merge Replication: Allows updates at both the primary and secondary databases, merging changes when synchronization occurs.
Key Features:some text
- Replicated databases can be updated independently.
- It supports load balancing by directing read requests to secondary replicas.
- It’s useful for reporting and analytics, where data can be pulled from replicas.

Mirroring

Purpose: Database mirroring focuses on high availability and disaster recovery, ensuring that a backup database can quickly take over if the primary database fails.
How It Works: In mirroring, there are two databases (primary and mirror) that maintain a synchronized state. Any transaction committed to the primary database is also committed to the mirror database in real-time.
Types of Mirroring:
- High-Safety Mode (Synchronous): The primary and mirror databases must acknowledge transactions. This mode provides zero data loss but can increase latency.
- High-Performance Mode (Asynchronous): Transactions are sent to the mirror without waiting for acknowledgment, reducing latency but risking some data loss in case of a failure.
Key Features:
- Mirrored databases are kept in sync, ensuring a precise copy of the primary database.
- It provides automatic failover capabilities if set up with a witness server.
- Mirroring is generally more suitable for mission-critical applications requiring minimal downtime.

Summary of Differences:

Purpose: Replication is for data distribution; mirroring is for high availability.
Update Behavior: Replicated databases can be updated independently; mirrored databases must be synchronized.
Use Cases: Replication supports load balancing and analytics; mirroring focuses on failover and disaster recovery.

29. What is a deadlock in SQL, and how can it be resolved?

‍A deadlock occurs when two or more transactions block each other by holding locks on resources that the others need to complete. This situation leads to a standstill where each transaction is waiting for the other to release its lock.Example of a Deadlock:

Transaction A locks Table 1 and then tries to lock Table 2.
Transaction B locks Table 2 and then tries to lock Table 1.
Neither transaction can proceed because each is waiting for the other to release the lock.

Deadlock Detection and Resolution:

Deadlock Detection:
- Database management systems (DBMS) periodically check for deadlocks.
- When a deadlock is detected, the DBMS identifies the transactions involved.
Transaction Termination:
- The DBMS selects one of the transactions to be terminated (rolled back) to resolve the deadlock.
- The choice of which transaction to terminate can be based on factors such as the transaction's age, resources consumed, or priority.
Rollback:
- The chosen transaction is rolled back, releasing its locks, allowing the other transaction(s) to proceed.

Deadlock Prevention Strategies:

Consistent Locking Order:some text
- Ensure that all transactions acquire locks in a predetermined order. This prevents circular wait conditions that lead to deadlocks.
Timeouts:some text
- Implement timeouts for transactions. If a transaction cannot acquire all necessary locks within a specific timeframe, it is aborted.
Lower Isolation Levels:some text
- Use lower isolation levels (like READ COMMITTED) when possible, as they reduce locking contention and can help minimize the chances of deadlocks.
Avoiding Long Transactions:some text
- Keep transactions short and efficient, which reduces the time locks are held and the likelihood of deadlocks.
Split Transactions:some text
- Break larger transactions into smaller, manageable transactions where feasible, which can reduce the chances of deadlocks.

By understanding and implementing these strategies, you can mitigate the risk of deadlocks in SQL databases and ensure smoother transaction processing.

‍30. How do you implement security in SQL databases?

‍Implementing security in SQL databases involves multiple strategies and best practices to protect sensitive data from unauthorized access and breaches.Key Security Measures:

Authentication:
- Use strong authentication methods to verify user identities. This may include:some text
  - Windows Authentication: Integrates with Active Directory for user validation.
  - SQL Server Authentication: Requires username and password specific to SQL Server.
Authorization:
- Implement role-based access control (RBAC) to manage user permissions based on their roles rather than individual users.
- Apply the principle of least privilege, ensuring users have only the permissions necessary for their job functions.
Data Encryption:
- Encrypt sensitive data both at rest and in transit:
  - At Rest: Use Transparent Data Encryption (TDE) to encrypt database files.
  - In Transit: Use SSL/TLS to secure data transmitted between clients and the server.
Auditing:
- Enable database auditing to track access and modifications to sensitive data. This helps identify unauthorized access attempts and data changes.
- Use tools or built-in features to log user activities, such as logins, queries executed, and schema changes.
Regular Updates and Patching:
- Regularly update the database management system (DBMS) and apply security patches to address vulnerabilities.
Backup Security:
- Protect backup files by encrypting them and storing them securely. Regularly test backup and recovery processes to ensure they work as intended.
Firewalls and Network Security:some text
- Use firewalls to restrict unauthorized access to the database server.
- Employ network segmentation to isolate the database server from other systems.
Secure Configuration:some text
- Review and configure security settings to disable unnecessary features and services.
- Change default usernames and passwords for database accounts.
Monitoring and Intrusion Detection:some text
- Implement monitoring solutions to detect and alert on suspicious activities, such as unusual login patterns or excessive query execution.
Security Policies:some text
- Establish and enforce security policies regarding data access, usage, and sharing. Train employees on security best practices.

31. What is the purpose of the EXPLAIN command in SQL? How can it help optimize queries?

‍The EXPLAIN command provides insight into how SQL queries are executed by the database engine. It generates a query execution plan, detailing how tables are scanned, the order of operations, join methods, and the indexes used.

‍How it helps optimize queries:

Understanding Execution Plans: By analyzing the execution plan, you can identify bottlenecks in the query, such as full table scans, which can be costly.
Index Usage: It shows whether indexes are being used effectively, helping to identify potential areas for adding indexes.
Join Methods: Understanding the join methods used (e.g., nested loop, hash join) can guide optimizations, such as adjusting the order of joins or rethinking join conditions.
Estimating Costs: The plan often includes estimates of resource usage (CPU, I/O), helping to identify which operations are the most expensive.

Example of using EXPLAIN:

EXPLAIN SELECT name, salary FROM employees WHERE department_id = 5;

32. What are triggers in SQL? How do they differ from stored procedures?

Triggers are special types of stored procedures that automatically execute (or "fire") in response to specific events on a table or view, such as INSERT, UPDATE, or DELETE operations. They help maintain data integrity, enforce business rules, and automatically perform actions when data changes.

Key differences from stored procedures:

Execution:some text
- Triggers execute automatically when specified events occur.
- Stored procedures must be explicitly called by the application or user.
Usage:some text
- Triggers are typically used for data validation, auditing, and enforcing business rules.
- Stored procedures are used for complex operations and business logic that might involve multiple queries and logic.
Parameters:some text
- Triggers do not accept parameters.
- Stored procedures can accept parameters for input and output.

Example of a trigger:

CREATE TRIGGER trg_after_insert
AFTER INSERT ON employees
FOR EACH ROW
BEGIN
    INSERT INTO audit_table (employee_id, action)
    VALUES (NEW.employee_id, 'Inserted');
END;

33. What is a cursor in SQL? How is it used?

A cursor is a database object that allows you to retrieve and manipulate rows returned by a query one at a time. Cursors are useful when you need to process data row by row rather than all at once.

Types of Cursors:

Implicit Cursors: Automatically created by SQL when a SELECT statement is executed.
Explicit Cursors: Defined by the user for specific queries and provide more control over row processing.

Using a Cursor:

Declare the cursor with a SELECT statement.
Open the cursor to establish the result set.
Fetch rows from the cursor one at a time.
Close the cursor when done to release resources.
Deallocate the cursor to remove it from memory.

Example of using an explicit cursor:

DECLARE my_cursor CURSOR FOR
SELECT employee_id, name FROM employees;

OPEN my_cursor;

FETCH NEXT FROM my_cursor INTO @employee_id, @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Process each row here
    FETCH NEXT FROM my_cursor INTO @employee_id, @name;
END;

CLOSE my_cursor;
DEALLOCATE my_cursor;

34. Explain the concept of SQL injection. How can it be prevented?

SQL injection is a code injection technique that exploits vulnerabilities in an application by allowing an attacker to manipulate SQL queries. By inserting malicious SQL code into an input field, an attacker can gain unauthorized access to data, modify or delete records, or execute administrative operations.

Prevention methods:

Parameterized Queries: Use prepared statements with parameters instead of concatenating user inputs into SQL queries.

SELECT * FROM users WHERE username = ? AND password = ?;

Stored Procedures: Encapsulate SQL logic in stored procedures, making it harder for attackers to modify the SQL command.
Input Validation: Validate and sanitize user inputs, ensuring that only expected values are processed. Use whitelisting where possible.
Least Privilege Principle: Grant minimal permissions to database users. For example, do not use administrative accounts for web applications.
Web Application Firewalls (WAFs): Use a WAF to filter and monitor HTTP requests, providing an additional layer of security.
Regular Security Audits: Conduct regular security assessments and code reviews to identify and fix vulnerabilities.

35. What is the difference between a primary key and a foreign key?

Primary Key:some text
- A primary key is a unique identifier for each record in a table. It must contain unique values and cannot contain NULL values.
- Each table can have only one primary key.

Example:

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT
);

Foreign Key:some text
- A foreign key is a column or set of columns in one table that uniquely identifies a row in another table. It establishes a relationship between the two tables.
- A foreign key can contain NULL values unless otherwise specified.

Example:

CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(100)
);

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES departments(department_id)
);

Key Differences:

Uniqueness: A primary key must be unique; a foreign key can have duplicates.
Nullability: A primary key cannot contain NULL; a foreign key can, unless it is defined as NOT NULL.
Purpose: Primary keys uniquely identify records in their table, while foreign keys create relationships between tables.

36. Explain how to handle performance tuning in SQL. What are some common techniques?

Performance tuning in SQL involves optimizing database queries and schema to improve the speed and efficiency of database operations. Common techniques include:

Indexing:some text
- Create indexes on columns frequently used in WHERE, JOIN, and ORDER BY clauses to speed up data retrieval.
- Use the EXPLAIN command to analyze index usage.
Query Optimization:some text
- Rewrite complex queries to be more efficient. Use joins instead of subqueries when possible.
- Avoid SELECT * and retrieve only the necessary columns.
Database Normalization:some text
- Normalize the database to eliminate redundancy and improve data integrity.
- In some cases, denormalization may be beneficial for read-heavy applications.
Use of Cached Results:some text
- Use caching techniques to store the results of frequently executed queries, reducing database load.
Partitioning:some text
- Partition large tables to improve performance and manageability, especially for read-heavy workloads.
Connection Pooling:some text
- Implement connection pooling to reduce the overhead of establishing connections to the database.
Monitoring and Profiling:some text
- Regularly monitor query performance and database metrics using profiling tools. Identify slow-running queries and optimize them.
Hardware Resources:some text
- Ensure the database server has adequate hardware resources (CPU, memory, disk I/O) to handle the workload efficiently.

37. What are the differences between INNER JOIN, LEFT JOIN, and RIGHT JOIN?

Joins are used to combine rows from two or more tables based on a related column. The key types of joins include:

INNER JOIN:some text
- Returns only the rows with matching values in both tables. If there is no match, those rows are excluded from the result.
- Example:

SELECT employees.name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.department_id;

LEFT JOIN (or LEFT OUTER JOIN):some text
- Returns all rows from the left table and matched rows from the right table. If there is no match, NULL values are returned for columns from the right table.
- Example:

SELECT employees.name, departments.department_name
FROM employees
LEFT JOIN departments ON employees.department_id = departments.department_id;

RIGHT JOIN (or RIGHT OUTER JOIN):some text
- Returns all rows from the right table and matched rows from the left table. If there is no match, NULL values are returned for columns from the left table.
- Example:

SELECT employees.name, departments.department_name
FROM employees
RIGHT JOIN departments ON employees.department_id = departments.department_id;

Summary:

INNER JOIN: Only matching rows from both tables.
LEFT JOIN: All rows from the left table, matched from the right.
RIGHT JOIN: All rows from the right table, matched from the left.

38. What are SQL constraints? Explain the different types of constraints.

Constraints are rules enforced on the data in a table to maintain integrity and consistency. They restrict the types of data that can be entered into a table. Common types of SQL constraints include:

NOT NULL:some text
- Ensures that a column cannot contain NULL values.
- Example:

CREATE TABLE employees (
    employee_id INT NOT NULL,
    name VARCHAR(100) NOT NULL
);

UNIQUE:some text
- Ensures that all values in a column are unique. It allows NULL values, but there can only be one NULL.
- Example:

CREATE TABLE users (
    user_id INT UNIQUE,
    email VARCHAR(255) UNIQUE
);

PRIMARY KEY:some text
- A combination of NOT NULL and UNIQUE, it uniquely identifies each row in a table. Each table can have only one primary key.
- Example:

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(100)
);

FOREIGN KEY:some text
- A key used to link two tables together. It creates a relationship between the foreign key in one table and the primary key in another.
- Example:

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

CHECK:some text
- Ensures that all values in a column satisfy a specific condition.
- Example:

CREATE TABLE products (
    product_id INT PRIMARY KEY,
    price DECIMAL CHECK (price >= 0)
);

DEFAULT:some text
- Assigns a default value to a column when no value is specified.
- Example:

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    status VARCHAR(10) DEFAULT 'active'
);

39. What is the difference between UNION and UNION ALL?

Both UNION and UNION ALL are used to combine the results of two or more SELECT statements, but they differ in how they handle duplicates:

UNION:some text
- Combines the results of two or more SELECT statements and removes duplicate rows from the result set.
- Example:

SELECT city FROM customers
UNION
SELECT city FROM suppliers;

UNION ALL:some text
- Combines the results of two or more SELECT statements but includes all rows, including duplicates.
- Example:

SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;

Key Differences:

Duplicates: UNION removes duplicates; UNION ALL retains all rows.
Performance: UNION ALL is generally faster because it doesn’t require the additional step of removing duplicates.

40. What are window functions in SQL, and how are they used?

Window functions are a type of function that performs calculations across a set of table rows that are related to the current row. Unlike regular aggregate functions, window functions do not collapse the result set into a single row; instead, they return a value for each row.

Commonly used window functions:

ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition.

SELECT name, salary,
       ROW_NUMBER() OVER (ORDER BY salary DESC) AS rank
FROM employees;

RANK(): Similar to ROW_NUMBER(), but it gives the same rank to identical values, skipping the subsequent ranks.

SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;

DENSE_RANK(): Similar to RANK(), but does not skip ranks for ties.

SELECT name, salary,
       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

SUM(), AVG(), etc.: You can perform aggregate calculations over a specific range of rows.

SELECT name, salary,
       SUM(salary) OVER (PARTITION BY department_id) AS department_total
FROM employees;

How to Use Window Functions:

Use the OVER() clause to define the window or set of rows to which the function will be applied.
You can partition the result set with PARTITION BY and order it with ORDER BY.

Window functions are powerful tools for analytics and reporting, enabling more complex queries without the need for subqueries or joins.

‍

WeCP Team

Team @WeCP

WeCP is a leading talent assessment platform that helps companies streamline their recruitment and L&D process by evaluating candidates' skills through tailored assessments

Check out these other Interview Questions...

Interviews, tips, guides, industry best practices, and news.

Spring Boot Interview Questions and Answers

Economics Interview Questions and Answers

Communication Interview Questions and Answers

GCP Interview Questions and Answers

System Engineer Interview Questions and Answers

Network Administrator Interview Questions and Answers

Power Apps Interview Questions and Answers

Quant Interview Questions and Answers

Rest API Interview Questions and Answers

View all posts