Whether you are a job seeker preparing for a new position where you can apply your SQL skills, or a hiring manager interviewing a candidate for a role in your company, it is important to know the common questions and answers from SQL job interviews. The ability to understand and write effective SQL is a cornerstone of almost every technical role that deals with data, from data analysis and data science to back-end engineering and data engineering. A candidate’s fluency in SQL often reveals their logical thinking, attention to detail, and ability to work with data structures.
SQL interview questions at a glance
In this article, we will look at various SQL interview questions and answers for people who are just starting out or already have some experience. For beginners, we will cover general questions about experience, the SQL dialects they know, and foundational knowledge. This section also covers the basics of SQL, such as what it is, what it is used for, common SQL statements, SQL commands, and different types of SQL queries. For advanced practitioners in later parts, we will cover topics like functions, joins, primary and foreign keys, indexes, and SQL relationships. We will also touch on database design concepts like normalization, denormalization, and the differences between various SQL statements.
General SQL interview questions
Before you are asked technical questions, your conversation partner may ask you a few general questions about your general experience with SQL. Don’t worry if you do not have much SQL experience yet; your interviewer probably already knows this from your resume. Since they want to talk to you anyway, they have likely considered your profile a good fit for their company. Furthermore, it is perfectly fine if you have only worked with one SQL dialect so far. Remember that all SQL dialects are quite similar. Therefore, knowing just one is a good foundation for learning the others.
1. What is SQL?
SQL stands for Structured Query Language. It is a standardized programming language used for managing and interacting with relational database management systems, or RDBMS. Its primary purpose is to allow users to communicate with a database to perform various tasks. This includes retrieving specific data, updating or inserting new data, and deleting data from database tables. It is also used to define and modify the structure of the database itself, such as creating or dropping tables and indexes. Because of its power and relative simplicity, it is a fundamental tool for data analysts, database administrators, developers, and data scientists.
2. What are SQL dialects? Give a few examples.
While SQL is a standard, most relational database management systems have their own proprietary implementation or version of the language. These different versions are known as SQL dialects. All variants of SQL share a very similar core syntax for common commands like SELECT, INSERT, UPDATE, and DELETE, making the skills highly transferable. However, they often differ in their additional features, such as how they handle NULL values, specific function names, data types, and procedural extensions. Some common examples of dialects include T-SQL, which is used by a major enterprise database system, and PL/SQL, which is used by another large commercial database provider. Other widely used dialects are found in open-source systems like PostgreSQL and MySQL, as well as lightweight, file-based systems like SQLite.
3. What are the main applications of SQL?
SQL is the standard language for managing relational databases, so its applications are vast. At its core, we can use SQL to create, delete, and update tables and define the entire database structure. Its most common application is to access, edit, and modify the data stored within those tables. For analytical purposes, SQL is used to extract specific, required information from one or more tables, allowing users to filter, sort, and summarize complex datasets to answer business questions. It also allows for granular control over the data, such as adding or removing specific rows or columns from a table, and managing permissions for who can access or change the data.
SQL interview questions for beginners
Your conversation partner could start the conversation with simpler questions to verify your foundational understanding. These questions are designed to ensure you have a grasp of the basic terminology and the logical structure of a query before moving on to more complex topics.
4. What is an SQL statement?
An SQL statement, also known as an SQL command, is a string of text that the database engine interprets as a valid command and executes accordingly. A statement is a complete instruction sent to the database. These statements are the building blocks of all interactions with the database. A few examples of common SQL statements include SELECT, INSERT, UPDATE, DELETE, CREATE TABLE, DROP TABLE, ALTER TABLE, GRANT, and REVOKE. A statement can be as simple as a single query to retrieve data or a more complex command to modify the database’s structure.
5. What is an SQL query?
An SQL query is a specific type of SQL statement used to request data or information from a database. While “statement” refers to any valid command, “query” almost always refers to a statement that retrieves data, primarily using the SELECT command. There are two main types of SQL queries: queries for retrieving data and queries for modifying data. Data retrieval queries are used to fetch the required data, which can include complex operations like limiting, grouping, sorting, and extracting from multiple tables. Data modification queries, while technically DML statements, are sometimes referred to as queries and are used to create, add, delete, update, and rename data.
6. What is an SQL subquery?
An SQL subquery, also called an inner query or nested query, is a complete query that is placed within another, outer query. A subquery is used to return data that will be used by the main query as a condition to further restrict the data to be retrieved. A subquery can appear in various clauses, such as SELECT, FROM, WHERE, and UPDATE. It is also possible to have a subquery within another subquery, a practice known as nesting. When a query is executed, the innermost subquery is executed first. Its result is then passed to the parent query that contains it, which then uses this result to execute.
7. What is an SQL join?
An SQL join is a clause used in a query to combine and retrieve records from two or more tables in a database. SQL tables are often related to each other using common columns, known as keys. A join operation allows you to merge these related tables based on the relationship between their columns, such as a common ID field. This is a fundamental concept in relational databases, as it allows for the normalization of data—storing data in separate, logical tables to reduce redundancy—while still being able to retrieve a complete, combined dataset when needed.
8. What is an SQL comment?
An SQL comment is a human-readable explanation or note included within a piece of SQL code. These comments are ignored by the SQL engine when the code is executed; their sole purpose is to make the code more understandable for people who may be reading or maintaining it later. SQL provides two ways to add comments. Single-line comments are typically started with a double hyphen (–), and the engine will ignore everything on that line after the double hyphen. Multi-line comments are used for longer explanations and are enclosed between /* at the beginning and */ at the end.
9. What is an SQL alias?
An SQL alias is a temporary, alternative name assigned to a table or a column within a table for the duration of a specific SQL query. Aliases are primarily used to make code more readable and concise, especially in complex queries. For example, if you have a table named quarterly_sales_data_final, you could assign it the alias sales in your FROM clause. Aliases are also essential when you join a table to itself or when you use aggregate functions and want to give the resulting calculated column a clear, descriptive name. An alias is typically introduced with the AS keyword, though in many dialects the AS is optional for table aliases.
Technical SQL interview questions
Now let’s take a look at the technical SQL interview questions and some possible answers to them. When answering technical questions, it is best to be as precise as possible. Vague answers might seem like an attempt to stray from the main topic. Furthermore, ambiguity could raise even more questions that you might not feel as confident in answering.
10. What types of SQL commands do you know?
SQL commands are broadly categorized into five types based on their function. Data Definition Language (DDL) is used to define, create, and modify the structure of database objects like tables and indexes. Data Manipulation Language (DML) is used for retrieving, editing, and modifying the data within the tables. Data Control Language (DCL) is used to regulate user access to data in the database, such as granting or revoking permissions. Transaction Control Language (TCL) is used for managing transactions, or groups of SQL statements, to ensure they are completed successfully. Finally, Data Query Language (DQL) is used to query the data; in modern SQL, this is often grouped with DML but is primarily represented by the SELECT statement.
11. Give a few examples of common SQL commands.
For Data Definition Language (DDL), common examples are CREATE TABLE to build a new table, ALTER TABLE to modify an existing table’s structure, DROP TABLE to delete a table entirely, TRUNCATE to delete all data from a table, and ADD COLUMN. For Data Manipulation Language (DML), the main commands are INSERT to add new rows, UPDATE to modify existing rows, and DELETE to remove rows. For Data Control Language (DCL), the two primary commands are GRANT to give permissions and REVOKE to remove them. For Transaction Control Language (TCL), examples include COMMIT to save a transaction, ROLLBACK to undo it, and SAVEPOINT. Finally, the primary Data Query Language (DQL) command is SELECT.
12. What is a DBMS and what types of DBMS do you know?
DBMS stands for Database Management System. It is a comprehensive software package that allows users and applications to interact with a database. It provides the core functionality for managing data, including accessing, updating, editing, inserting, and deleting it, while also handling tasks like security, backups, and concurrency control. There are different types of database management systems, which are categorized based on how they organize, structure, and store data. The main types include relational, hierarchical, network, graph, and object-oriented. Each model is suited for different types of data and applications.
13. What is RDBMS? Give a few examples of RDBMS.
RDBMS stands for Relational Database Management System. This is currently the most common type of DBMS used in the industry. An RDBMS is designed to work with data stored in a structured, tabular format—that is, in tables composed of rows and columns. The “relational” part comes from the fact that these tables can be linked, or related, to one another using common keys. This model is highly efficient for storing and retrieving structured data while minimizing redundancy. The SQL programming language is specifically designed to interact with RDBMS. Some well-known examples of RDBMS include open-source systems like PostgreSQL and MySQL, commercial enterprise systems, and file-based systems like SQLite.
14. What are tables and fields in SQL?
In a relational database, a table is the primary object used to store an organized collection of data. It is structured in a tabular format, consisting of rows and columns. Each row represents a single, complete record or entity, such as a specific customer or a single product. A field is another term for a column in a table. Each column represents a specific attribute or piece of information for the records, such as the “LastName” of a customer or the “Price” of a product. Every field is defined to hold a specific data type, like text, a number, or a date, ensuring that all data in that column is consistent.
15. What types of SQL subqueries do you know?
SQL subqueries can be categorized in several ways. One way is by the data they return: a single-line subquery returns at most one row and one column, often used with comparison operators. A multiple-line subquery returns at least two rows, and is often used with operators like IN or EXISTS. A multiple-column subquery returns at least two columns and is often used in a FROM clause as a derived table. Another way to categorize them is by their relationship with the outer query: a nested subquery is a self-contained query that executes once and passes its result to the outer query. In contrast, a correlated subquery is a subquery that is related to the information from the outer query and must be executed once for each row processed by the outer query.
16. What is a constraint and why is it used?
A constraint is a set of rules or conditions that are applied to a column or a table to define what type of data can be entered and how the data relates to other data. The primary purpose of constraints is to enforce data integrity, accuracy, and reliability at the database level. They ensure that the data in a table remains accurate and consistent by preventing unwanted or invalid actions, such as inserting a duplicate ID or a null value into a required field. By defining these rules directly on the database structure, constraints provide a robust way to maintain the quality of the data, regardless of the application or user that is trying to modify it.
17. What SQL constraints are you aware of?
There are several types of SQL constraints used to enforce data integrity. The NOT NULL constraint ensures that a column cannot have a NULL or empty value. The UNIQUE constraint ensures that all values in a column (or a set of columns) are different from one another. The PRIMARY KEY constraint is a combination of NOT NULL and UNIQUE, and it uniquely identifies each row in a table. The FOREIGN KEY constraint is used to link two tables together, ensuring that a value in a column of one table matches a value in the PRIMARY KEY of another table. The DEFAULT constraint specifies a default value for a column if no value is provided during an insert. Finally, the CHECK constraint ensures that the values in a column satisfy a specific condition.
18. What types of joins do you know?
The main types of joins are INNER JOIN and the OUTER joins. An INNER JOIN is the standard join and returns only the records that satisfy the specified join condition in both tables. The OUTER joins include LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN. A LEFT JOIN returns all records from the left table and only the records from the right table that satisfy the join condition. A RIGHT JOIN does the opposite, returning all records from the right table and only the matching records from the left. A FULL JOIN returns all records from both tables, combining the results of both a left and right join. Additionally, there is a CROSS JOIN, which returns the Cartesian product of both tables.
19. What is a primary key in SQL?
A primary key is a column, or a set of columns, in a table for which the PRIMARY KEY constraint has been set. This constraint enforces two critical rules: all values in that column must be unique, and no value in that column can be NULL. In essence, a primary key is a combination of the NOT NULL and UNIQUE constraints. Its fundamental purpose is to uniquely identify each individual record, or row, in the table. Because it provides this unique identifier, any table can have at most one primary key. While a primary key is not strictly required by all database engines, it is a fundamental concept of relational database design and is highly recommended for all tables.
20. What is a unique key in SQL?
A unique key is a column, or a set of columns, in a table to which the UNIQUE constraint has been applied. This constraint ensures that all values in that column are distinct from one another. Its purpose is to prevent duplicate entries for that specific attribute. Unlike a primary key, a table can have multiple unique keys. Another key difference is that the UNIQUE constraint will typically allow one, and in some database systems, multiple NULL values. The handling of NULL values is a key differentiator between database dialects; some systems treat NULL as a value and allow only one, while others treat NULL as an unknown and allow many.
21. What is a foreign key in SQL?
A foreign key is a column, or a set of columns, in a table to which the FOREIGN KEY constraint has been applied. This constraint creates a link between two tables by ensuring that the values in the foreign key column of one table (the “child” table) must match the values in the PRIMARY KEY or a UNIQUE key column of another table (the “parent” table). The purpose of foreign keys is to enforce referential integrity, which means that the relationship between tables is always valid. For example, a foreign key can prevent you from adding an order to an “Orders” table for a customer that does not exist in the “Customers” table.
22. What is an SQL index?
An SQL index is a special data structure that is associated with a table, and it is one of the most important tools for improving query performance. An index is used to store important parts of the table, typically one or more columns, in a pre-sorted order. This allows the database engine to speed up searching and retrieving data, much like how an index in the back of a book helps you find a topic without reading the entire book. Instead of performing a “full table scan” (reading every row), the engine can use the index to find the location of the required data quickly. Indexes are especially useful for large databases, where they can significantly improve query performance.
23. What types of indices do you know?
The two main types of indexes are clustered and non-clustered. A table can only have one clustered index because it defines the physical order of how the records are stored on disk. The database searches for data using the key values of the clustered index, which is often the primary key. A non-clustered index, on the other hand, maintains a logical order of entries that does not correspond to the physical order of the data on the hard drive. The data is stored in one location (the “heap” or clustered index), and the non-clustered index is a separate structure that points to the data. A table can have multiple non-clustered indexes. Another type is a unique index, which enforces the UNIQUE constraint and prevents duplicate entries in a column.
24. What is a schema?
In SQL, a database schema is a collection of logical structural elements, such as tables, views, stored procedures, indexes, functions, and triggers. A schema is essentially a container or namespace that organizes these objects within a database. It illustrates a part of the database architecture and defines the relationships between different objects. Schemas are also a key part of database security, as they allow administrators to establish different access permissions for different objects. For example, you can grant a user permission to read from all tables in a “Sales” schema but no access to the “HR” schema.
25. What is an SQL operator?
An SQL operator is a reserved character, combination of characters, or a keyword used in an SQL query to perform a specific operation. These operations can be mathematical calculations, comparisons between values, logical combinations of conditions, or string pattern matching. SQL operators are most frequently used within the WHERE clause of a query to specify a condition, or multiple conditions, for filtering data to be returned. For example, the = operator is used to test for equality, and the AND operator is used to combine two conditions, both of which must be true.
26. What types of SQL operators do you know?
SQL operators can be grouped into several categories. Arithmetic operators perform mathematical calculations, such as + (addition), – (subtraction), * (multiplication), and / (division). Comparison operators are used to compare two values, such as > (greater than), < (less than), = (equal to), and >= (greater than or equal to). Logical operators are used to combine multiple conditions, such as AND, OR, and NOT, as well as BETWEEN and IN. String operators are used for pattern matching, primarily LIKE along with its wildcards % and _. Finally, set operators are used to combine the results of two SELECT statements, and include UNION, UNION ALL, INTERSECT, and EXCEPT (or MINUS in some dialects).
27. What is a clause?
In SQL, a clause is a component of a query that begins with a specific keyword and performs a specific function. Clauses are the building blocks of a SELECT statement, and each one is used to filter, sort, group, or otherwise define the data being retrieved. For example, the WHERE clause is used to filter the data based on a specific condition. The ORDER BY clause is used to sort the result set. Other common clauses include FROM, LIMIT, HAVING, LIKE, AND, and OR. Each clause provides a specific instruction to the database engine to refine the result set.
28. What are some common statements used with the SELECT query?
The SELECT query is used to retrieve data, and it is constructed using several common clauses that are processed in a logical order. The most common and essential clauses are FROM, which specifies the table or tables to retrieve data from, and WHERE, which filters the rows based on a specific condition. After the data is filtered, the GROUP BY clause can be used to aggregate rows that have the same values into summary rows. The HAVING clause is then used to filter these aggregated groups. Finally, the ORDER BY clause is used to sort the final result set in ascending or descending order, and the LIMIT clause can be used to restrict the number of rows returned.
29. How do you create a table in SQL?
You create a table in SQL using the CREATE TABLE statement, which is a Data Definition Language (DDL) command. This statement defines the name of the new table and, just as importantly, the names and data types for each of its columns. The data type specifies what kind of data the column can hold, such as an integer, a text string, or a date. You can also define constraints for each column at the time of creation, such as NOT NULL or UNIQUE. For example, to create a simple table named “Employees” with three columns for an ID, a name, and a hire date, the basic syntax would be: CREATE TABLE Employees (EmployeeID INT, LastName VARCHAR(255), HireDate DATE);.
30. How do you update a table?
You update existing data in a table using the UPDATE statement, which is a Data Manipulation Language (DML) command. This statement allows you to modify the values of one or more columns in one or more rows. The UPDATE statement is always used with the SET clause, which specifies which column to update and what its new value should be. It is almost always used with a WHERE clause to specify which row or rows should be modified. If you omit the WHERE clause, the UPDATE statement will modify the specified columns for all rows in the table, which can be a dangerous and irreversible action. The basic syntax is: UPDATE table_name SET column1 = value1 WHERE condition;.
31. How do you delete a table from a database?
You delete an entire table from a database using the DROP TABLE statement, which is a Data Definition Language (DDL) command. This command completely removes the table’s definition, all of the data stored within it, and any associated indexes, triggers, and constraints. This is a permanent and irreversible action, so it must be used with extreme caution. Once a table is dropped, the data cannot be recovered unless you have a backup of the database. The syntax is very simple: DROP TABLE table_name;. This command should not be confused with DELETE or TRUNCATE, which only remove data from a table but leave the table structure itself intact.
32. How can you find out the number of records in a table?
You can find out the total number of records, or rows, in a table by using the COUNT() aggregate function. This function is used within a SELECT query. The most common way to use it for this purpose is to pass an asterisk as its argument, which tells the function to count all rows. The syntax is: SELECT COUNT(*) FROM table_name;. This query will return a single row with a single column containing the total number of rows in the specified table. This is a very fast and efficient operation, as the database engine typically keeps metadata about the number of rows in each table.
33. How do you sort records in a table?
You sort the records returned by a query using the ORDER BY clause. This clause is added to the end of a SELECT statement and specifies which column or columns to use for sorting. By default, the ORDER BY clause sorts the data in ascending order. You can explicitly specify ascending order with the ASC keyword. To sort the data in descending order, you must use the DESC keyword. You can also sort by more than one column. The query will first sort by the first column listed, and then, for any rows that have the same value in the first column, it will sort them based on the second column, and so on.
34. How do you select all columns from a table?
You select all columns from a table using the asterisk (*) wildcard character in your SELECT statement. The asterisk is a shortcut that stands for “all columns.” This is a quick and easy way to see all the data in a table, especially during exploration or debugging. The syntax is: SELECT * FROM table_name;. While this is very convenient, it is generally considered bad practice to use SELECT * in production code. This is because it can be inefficient, retrieving more data than necessary, and it can cause code to break in the future if a column is added to or removed from the table. In production, it is always better to explicitly list the names of the columns you need.
35. How do you select common data sets from two tables?
You select only the common data sets, or rows, that exist in the results of two different queries by using the INTERSECT set operator. This operator takes the result sets of two SELECT statements and returns only the rows that are identical and present in both result sets. For this operator to work, the SELECT statements must have the same number of columns, and the data types of the corresponding columns must be compatible with each other. The INTERSECT operator also automatically removes duplicate rows from the final result set. The syntax is: SELECT col1, col2 FROM table_1 INTERSECT SELECT col1, col2 FROM table_2;.
36. What is the DISTINCT statement and how is it used?
The DISTINCT keyword is used in a SELECT statement to filter out duplicate rows and return only the unique, or distinct, values from a column or a combination of columns. When you query a column, it will return all values, including duplicates. For example, if you query a “City” column, you may get “London” listed multiple times. By adding the DISTINCT keyword right after SELECT, you tell the database to return each unique city only once. The syntax is: SELECT DISTINCT column_name FROM table_name;. This is very useful for analytical queries where you want to find out the unique set of values present in a field.
37. What are relationships? Give a few examples.
In a relational database, relationships are the connections and logical associations between tables. They are the foundation of the “relational” model. These relationships are defined using primary and foreign keys, which link tables based on common data fields. For example, you can find the same customer ID in a “Customers” table (where it is the primary key) and in an “Orders” table (where it is a foreign key). This links a specific customer to all of their orders. The main types of relationships are one-to-one (one customer has one profile), one-to-many (one customer can have many orders), and many-to-many (one product can be in many orders, and one order can contain many products).
38. What is a NULL value? What is the difference between it and zero or a blank space?
A NULL value represents the absence of data. It indicates that no value exists for a particular cell in a table; the value is unknown or missing. This is fundamentally different from a zero or a blank space. A zero (0) is a valid numeric value that represents the number zero. A blank space or an empty string (”) is a valid character value that represents a string of zero length. NULL is not a value; it is a marker for missing information. This distinction is critical in queries, as NULL values are not equal to anything, not even other NULL values. You must use special operators like IS NULL or IS NOT NULL to check for them.
39. What is the difference between SQL and NoSQL?
The primary difference lies in their data model and structure. SQL databases are relational (RDBMS) and use a structured schema. This means the data is organized into tables with predefined columns and data types, and relationships between tables are strictly enforced. This model is excellent for structured data and ensures high data integrity. NoSQL databases, on the other hand, are non-relational and typically schema-less or have a dynamic schema. They are designed to process unstructured or semi-structured data and can use various data models, such as document (key-value pairs), key-value, wide-column, or graph. NoSQL databases are often preferred for their high scalability and flexibility, especially in large-scale web applications.
40. What are some typical problems when working with SQL databases?
There are several common challenges when working with SQL databases. Performance optimization for large datasets is a constant concern; this includes writing efficient queries, managing indexing strategies correctly, and analyzing query execution plans to find bottlenecks. Ensuring data integrity with constraints is crucial but can add overhead. Handling concurrency, where multiple users are trying to read and write data at the same time, is another major challenge that requires careful transaction management to prevent conflicts and data corruption. Other typical problems include database design (normalization vs. denormalization), planning for backups and disaster recovery, and managing security and user access permissions.
41. What is a function in SQL?
A function in SQL is a reusable, pre-compiled database object that represents a series of SQL statements designed to perform a specific task. A function is called by its name, takes some input parameters (or no parameters at all), performs calculations or other operations with those inputs, and then returns a single value or, in some cases, a table. Functions are used to encapsulate complex or frequently used logic, which makes code more readable, modular, and maintainable. By using functions, you can avoid repeatedly writing the same code snippets, leading to more efficient and reliable code.
42. What types of SQL functions do you know?
SQL functions can be broadly categorized into two main types: aggregate functions and scalar functions. Aggregate functions operate on a set of rows, usually a group of data, and return a single value for that entire group. For example, the SUM() function calculates the total of all values in a column. Scalar functions, on the other hand, operate on a single, individual value and return a single value in response. For example, the UPPER() function takes a single string value and returns it in uppercase. Beyond this, functions can also be categorized as either built-in (predefined by the database system) or user-defined (created by the user to meet their specific business logic needs).
43. Which SQL aggregate functions do you know?
Aggregate functions are used to perform a calculation on a set of rows and return a single, summary value. The most common aggregate functions are COUNT(), which returns the number of rows (it counts non-null values if a column name is specified, or all rows if * is used). SUM() returns the total sum of all values in a numeric column. AVG() returns the average value of a numeric column. MIN() returns the smallest value in a column. And MAX() returns the highest value in a column. These functions are almost always used in conjunction with the GROUP BY clause to calculate summary values for different subgroups of data.
44. Which SQL scalar functions do you know?
Scalar functions operate on a single value and return a single value. There are many types, but some of the most common include string functions, numeric functions, and date functions. For example, UCASE() (or UPPER() in many dialects) returns a string converted to uppercase. LCASE() (or LOWER()) returns a string converted to lowercase. LEN() (or LENGTH()) returns the length of a string. MID() (or SUBSTR()) extracts a substring from a string. For numeric values, ROUND() returns a numerical value rounded to a specific number of decimal places. For dates, NOW() (or GETDATE() in some dialects) returns the current date and time. INITCAP() is also a scalar function that capitalizes the first letter of each word.
45. What are case handling functions in SQL?
Case handling functions are a specific category of scalar functions used to change the letter casing of text data. These functions are crucial for data cleaning and standardization, as they allow you to ensure that data is in a consistent format for comparison or presentation. The three primary case handling functions are UCASE() or UPPER(), which converts an entire string to uppercase. LCASE() or LOWER(), which converts an entire string to lowercase, is very useful for case-insensitive comparisons. Finally, INITCAP() is a function available in some dialects that returns a string in “title case,” where the first letter of each word is capitalized and the rest are lowercase.
46. What are character manipulation functions in SQL?
Character manipulation functions are a subset of scalar functions used to modify, combine, or extract parts of text data. CONCAT() is used to join two or more string values together. SUBSTRING() or SUBSTR() is used to extract a portion of a string by specifying a start position and length. LENGTH() or LEN() returns the number of characters in a string. REPLACE() is used to replace all occurrences of a specific substring with another substring. INSTR() returns the numerical position of a substring within a string. LPAD() and RPAD() are used to pad a string to a certain length with specified characters on the left or right side. TRIM() is used to remove leading, trailing, or both spaces or other specified characters from a string.
47. What is the difference between local and global variables?
The difference between local and global variables relates to their scope and lifetime, primarily within procedural SQL extensions. A local variable is declared within a specific function, stored procedure, or batch of code. It can only be accessed and used within that specific block of code where it was declared. Its value is lost once the function or procedure finishes executing. A global variable, on the other hand, is declared outside of any specific function and is stored in a fixed memory structure. This means it can be accessed and used by any function, procedure, or program throughout the entire database session or even system-wide, depending on the dialect. Global variables persist for the duration of the session.
48. What is the difference between SQL and PL/SQL?
SQL is the standard, declarative language for querying and managing relational databases. It is primarily used for data manipulation and retrieval, focusing on what data you want, not how to get it. PL/SQL, which stands for Procedural Language/SQL, is a procedural extension of SQL used specifically in Oracle databases. It incorporates procedural programming constructs that standard SQL lacks, such as IF-THEN-ELSE conditions, loops, exception handling, and declared variables. This allows developers to write complex business logic, stored procedures, and triggers that can be executed directly within the database, combining the power of SQL data access with the flexibility of a procedural programming language. Other dialects have similar procedural extensions.
49. What is the difference between LEFT JOIN and LEFT OUTER JOIN?
There is absolutely no functional difference between LEFT JOIN and LEFT OUTER JOIN. The keyword OUTER in the LEFT OUTER JOIN syntax is optional and purely for readability. Both LEFT JOIN and LEFT OUTER JOIN are interchangeable and will produce the exact same query execution plan and result set. They both return all records from the left table and only the matching records from the right table. If no match is found in the right table, the columns from the right table will be filled with NULL values. Most developers omit the OUTER keyword and simply use LEFT JOIN as it is more concise. The same logic applies to RIGHT JOIN and FULL JOIN.
50. What is indexing in SQL and how does it improve performance?
Indexing is the process of creating a special data structure that is associated with a table to speed up data retrieval. This structure functions like an optimized, pre-sorted lookup table for the values in the indexed columns. It allows the database to find rows more efficiently by giving it a direct path to the data, much like a book’s index. This reduces the need for full table scans, where the database would have to read every single row in the table to find the requested data. However, there is a trade-off. While indexes significantly speed up SELECT queries, they can slow down INSERT, UPDATE, and DELETE operations because the indexes themselves must also be updated.
51. What is a stored procedure and how does it differ from a function?
A stored procedure is a pre-compiled set of one or more SQL statements that are saved in the database and can be executed as a single unit to perform a task. Procedures are highly flexible; they can modify data, manage transactions, and return zero, one, or multiple result sets. Functions, on the other hand, are typically used within SQL expressions to calculate a value. The main difference is that a function must return a value (either a single scalar value or a table) and, in many databases, is restricted from having side effects like modifying data in tables. A procedure is called on its own, while a function is called as part of another SQL statement.
52. How is the data sorted by default using the ORDER BY statement, and how can this be changed?
By default, the ORDER BY statement sorts the data in ascending order. This can be explicitly stated using the ASC keyword. The handling of NULL values in a sort varies by database dialect; some systems place NULL values first in an ascending sort, while others place them last. To change the sort order to descending, you must use the DESC keyword after the column name. This can be specified for each column individually in the ORDER BY clause, allowing for complex sorting, such as sorting by one column in descending order and another in ascending order. The syntax is: SELECT * FROM table_name ORDER BY col_1 DESC, col_2 ASC;.
53. What are SQL set operators?
SQL set operators are used to combine the result sets of two or more SELECT statements into a single result set. The main set operators are UNION, UNION ALL, INTERSECT, and EXCEPT. UNION returns all unique records found by either the first or the second query, effectively stacking them and removing duplicates. UNION ALL does the same but includes all duplicate entries. INTERSECT returns only the records that are found in both queries. EXCEPT, which is called MINUS in some database dialects, returns only the records found by the first query that are not present in the second query. All set operators require that the SELECT statements have the same number of columns with compatible data types.
54. Which operator is used in the pattern recognition query?
The operator used for pattern recognition, or pattern matching, in SQL is the LIKE operator. This operator is used within a WHERE clause to search for a specified pattern in a text column. The LIKE operator uses two wildcard characters to define the pattern. The percent sign (%) is a wildcard that represents any number of characters, including zero characters. For example, ‘A%’ would find any string that starts with the letter ‘A’. The underscore (_) is a wildcard that represents exactly one character. For example, ‘H_T’ would find strings like “HOT” or “HAT” but not “HT” or “HEAT”.
55. What is the difference between a primary key and a unique key in SQL?
Both PRIMARY KEY and UNIQUE key constraints ensure that the values in a column or set of columns are unique. However, there are two key differences. First, a table can have only one primary key, but it can have multiple unique keys. The primary key’s main purpose is to uniquely identify each record in the table. Second, a primary key constraint enforces NOT NULL by default; its columns cannot contain any NULL values. A unique key constraint, on the other hand, does allow NULL values. The handling of these NULLs varies by database system; some allow only one NULL, while others allow multiple.
56. What is a composite primary key in SQL?
A composite primary key is a primary key that is based on the combination of two or more columns in a table. This is used when a single column is not sufficient to uniquely identify each row. For example, in a “OrderDetails” table that links orders and products, neither the “OrderID” nor the “ProductID” alone would be unique, as an order can have multiple products and a product can be in multiple orders. However, the combination of “OrderID” and “ProductID” is unique for each row. By creating a composite primary key on both columns, the database ensures that no duplicate pairs can be inserted, thus maintaining the integrity of the table.
57. What is the typical order of SQL clauses in a SELECT statement?
When writing a SELECT statement, the SQL standard dictates a specific order for the main clauses. You must write the clauses in this syntactic order for the query to be valid. The typical order is: SELECT, followed by FROM, then JOIN with its ON condition, then WHERE, then GROUP BY, then HAVING, then ORDER BY, and finally, a LIMIT or OFFSET clause. While not all clauses are required for every query, they must appear in this sequence when they are used. For example, you cannot place a WHERE clause after a GROUP BY clause. This order ensures the query is unambiguous and can be correctly parsed by the database engine.
58. In what order does the interpreter execute the usual instructions in the SELECT query?
While the written order of a SELECT statement is fixed, the logical execution order is quite different. This is one of the most important concepts to understand for debugging and optimization. The database engine first executes the FROM, JOIN, and ON clauses to assemble the working set of data from all specified tables. Next, it applies the WHERE clause to filter out individual rows that do not meet the specified conditions. After this, it executes the GROUP BY clause to aggregate the remaining rows into groups. Then, the HAVING clause is applied to filter out entire groups that do not meet its conditions. Only then is the SELECT clause processed to choose the final columns. Finally, the ORDER BY clause is executed to sort the results, and the LIMIT or OFFSET clause is applied to pick the final rows.
59. What is a view in SQL?
A view is a virtual table that is based on the result set of a SELECT query. A standard view does not store any data itself; instead, it is a saved query that is executed dynamically whenever the view is accessed. Views are used for several reasons: they can simplify complex queries by encapsulating the logic of joins and aggregations into a simple, reusable object. They can be used for security to restrict data access, for example, by creating a view that only shows certain columns or rows from a table to a specific user. They can also be used to consolidate and merge data from multiple tables into a single virtual table, providing a unified logical view of the data.
60. Can we create one view based on another view in SQL?
Yes, it is possible to create a view that is based on another existing view. This is known as creating a nested view. For example, you could have a base view that joins two large tables. You could then create a second, simpler view that queries the first base view but adds a WHERE clause to filter the results for a specific department. While this is technically possible and can sometimes be useful for layering logic, it is generally a practice to be used with caution. Nesting multiple views on top of each other can make the code very difficult to read, debug, and maintain. It can also lead to significant performance issues, as the database engine may have to expand and execute multiple complex queries just to satisfy a simple request.
61. Can we still use a view if the original table has been deleted?
No. If the original table or tables that a standard view is based on have been deleted using the DROP TABLE command, the view will become invalid and can no longer be used. A standard view is just a saved query definition; it does not store a copy of the data. When you try to query the view, the database engine attempts to execute its underlying SELECT statement, which references the now-nonexistent table. This will result in an error, typically stating that the base object does not exist. The view itself must be explicitly dropped and recreated if a new table with the same name and structure is created.
62. What types of SQL relationships do you know?
There are three main types of relationships that can exist between tables in a relational database. The first is a one-to-one relationship, where one row in a table is linked to at most one row in another table. This is often used to split a table for security or performance, such as separating a main “Employee” table from a “EmployeeSalary” table. The second and most common is a one-to-many relationship, where one row in a “parent” table can be linked to many rows in a “child” table. A classic example is one customer having many orders. The third is a many-to-many relationship, where many rows in one table can be linked to many rows in another. This requires a third, intermediate “join” table, such as a “ProductOrders” table linking a “Products” table and an “Orders” table.
63. What are the possible values of a BOOLEAN data field?
A BOOLEAN data field is a data type that can store one of two possible values, typically representing logical states. These values are TRUE and FALSE. In addition to these two states, a BOOLEAN field can also store a NULL value, which represents an “unknown” or “missing” state. This three-value logic (TRUE, FALSE, NULL) is a key concept in SQL. It is important to note that not all database systems have a dedicated BOOLEAN data type. Some systems emulate it using a small integer type, such as a TINYINT or BIT, where the value 1 is used to represent TRUE and 0 is used to represent FALSE.
64. What is normalization in SQL?
Normalization is the process of designing and structuring a relational database to minimize data redundancy and improve data integrity. This is achieved by organizing data into multiple, smaller, related tables and defining clear relationships between them using keys. The goal is to ensure that each piece of data is stored in only one place. For example, instead of storing a customer’s name and address in every single order they place, normalization dictates that you store customer information in a “Customers” table and only store the “CustomerID” in the “Orders” table. This reduces storage space and, more importantly, makes data updates easier and less error-prone. If a customer’s address changes, you only need to update it in one place.
65. What is denormalization in SQL?
Denormalization is the process of intentionally adding redundant data to a database to improve query performance. It is the opposite of normalization. While a highly normalized database is efficient for writing and updating data, it can be slow for reading data, as queries may need to join many tables together to get a complete answer. Denormalization optimizes for read performance by adding pre-joined or pre-calculated data back into tables. For example, in an e-commerce database, you might add the “ProductName” to the “OrderDetails” table, even though it already exists in the “Products” table. This makes queries on the “OrderDetails” table faster because they no longer need to join with the “Products” table just to get the name. It is a strategic trade-off, sacrificing write efficiency for read speed.
66. What is the difference between renaming a column and assigning an alias?
The difference lies in their permanence and scope. Renaming a column is a permanent, structural change to the database, which is done using the ALTER TABLE Data Definition Language (DDL) command. When you rename a column, you are changing the actual name of that column in the table’s definition. This change will be permanent and visible to all users and all future queries. An alias, on the other hand, is a temporary, alternative name given to a column only for the duration of a single query. It is assigned in the SELECT clause using the AS keyword. The alias does not change the column’s real name in the database; it only changes how that column is labeled in the final result set of that one query.
67. What is the difference between nested and correlated subqueries?
The primary difference is in their execution and dependency on the outer query. A nested subquery is a self-contained, independent query. It is executed once at the very beginning, and its result set is then used by the outer query as a static value or list. The outer query depends on the inner query, but the inner query does not depend on the outer query. A correlated subquery, however, is dependent on the outer query. It references one or more columns from the outer query, so it must be executed repeatedly, once for each individual row that is being processed by the outer query. This makes correlated subqueries much less efficient and slower than nested subqueries in most cases.
68. What is the difference between clustered and non-clustered indexes?
The main difference is in how they store data. A clustered index defines the physical sort order of the data in the table. The rows of the table are stored on disk in the same order as the clustered index. Because of this, a table can only have one clustered index. A primary key is often implemented as a clustered index by default. A non-clustered index is a separate data structure that is stored apart from the table’s data. It contains the indexed column values in sorted order, and each index entry has a pointer or “row locator” that points to the actual data row. Because they are stored separately, a table can have multiple non-clustered indexes.
69. What is the CASE() function?
The CASE statement is a powerful control-flow expression that allows you to implement if-then-else logic directly within your SQL query. It is not technically a function in all dialects but an expression. It allows you to return different values based on a set of conditions. A CASE expression starts with the CASE keyword, is followed by one or more WHEN … THEN … clauses that define a condition and a result, and can optionally include an ELSE clause to provide a default value if none of the WHEN conditions are met. It must always end with the END keyword. It is incredibly useful for creating new, conditional columns, categorizing data, or handling different scenarios within a single query.
70. What is the difference between the DELETE and TRUNCATE statements?
Both DELETE and TRUNCATE are used to remove data from a table, but they work very differently. DELETE is a Data Manipulation Language (DML) command. It removes rows one by one and logs each deletion in the transaction log. Because it is logged, a DELETE operation can be rolled back. You can also use a WHERE clause with DELETE to remove only specific rows. TRUNCATE is a Data Definition Language (DDL) command. It removes all data by deallocating the data pages, which is much faster and uses fewer system resources. However, it removes all rows and cannot be used with a WHERE clause. Because it is minimally logged, it typically cannot be rolled back in the same way as a DELETE.
71. What is the difference between the DROP and TRUNCATE statements?
Both DROP and TRUNCATE are Data Definition Language (DDL) commands and are very destructive, but they operate on different objects. TRUNCATE removes all rows from a table, but the table structure itself (the columns, data types, and constraints) remains intact. It is like emptying a box completely but keeping the box. DROP, on the other hand, removes the entire object from the database. The DROP TABLE command deletes the table’s structure, all of its data, and all associated indexes, triggers, and constraints. It is like taking the box and burning it. A DROP operation is permanent and irreversible without a backup.
72. What is the difference between the HAVING and WHERE statements?
This is a classic SQL interview question. The WHERE clause and the HAVING clause are both used to filter data, but they operate at different stages of a query. The WHERE clause is used to filter individual rows before any grouping or aggregation occurs. It is processed early in the query execution. The HAVING clause is used to filter groups of rows after the GROUP BY clause has been applied and aggregate functions have been calculated. You must use HAVING if you want to filter based on the result of an aggregate function, such as SUM(Sales) > 1000. You cannot use an aggregate function in a WHERE clause.
73. How do you add a record to a table?
You add a new record, or row, to a table using the INSERT INTO statement, which is a Data Manipulation Language (DML) command. There are two common ways to write this statement. The first method is to specify the columns you are inserting data into, followed by the VALUES keyword and a list of the values. This is the recommended method as it is explicit and will not break if the table structure changes. The syntax is: INSERT INTO table_name (column1, column2) VALUES (value1, value2);. The second method omits the column names, but you must then provide a value for every column in the table, in the exact order they are defined in the table’s structure.
74. How do you delete a record from a table?
You delete one or more records from a table using the DELETE FROM statement, a Data Manipulation Language (DML) command. This statement is almost always used with a WHERE clause to specify the exact condition that identifies the row or rows to be removed. The syntax is: DELETE FROM table_name WHERE condition;. It is critically important to be careful with the WHERE clause, as a missing or incorrect condition can lead to the deletion of the wrong data. If you execute the DELETE FROM statement without any WHERE clause at all, it will delete every single row from the table, but the table structure itself will remain.
75. How do you add a column to a table?
You add a new column to an existing table using the ALTER TABLE statement, which is a Data Definition Language (DDL) command. This command is used to modify the structure of a table. To add a column, you use the ADD COLUMN clause, followed by the name of the new column and its data type. You can also specify constraints for the new column at the same time, such as a DEFAULT value or a NOT NULL constraint (if a default is also provided). The basic syntax is: ALTER TABLE table_name ADD COLUMN new_column_name datatype;. This operation can be resource-intensive on very large tables, as the database may need to rewrite the table’s data blocks.
76. How do you change the name of a column in a table?
You change the name of an existing column in a table using the ALTER TABLE statement, but the exact syntax can vary significantly between different SQL dialects. In some systems, the command is ALTER TABLE table_name RENAME COLUMN old_name TO new_name;. In other systems, the syntax might be ALTER TABLE table_name CHANGE old_name new_name new_datatype;, which also requires you to respecify the data type. Because this syntax is not standardized, it is one of the commands you often need to look up for the specific database system you are working with. Renaming a column is a structural change and can break applications or queries that reference the old column name.
Conclusion
This series has covered a comprehensive range of 84 SQL questions and answers, from foundational concepts to advanced practical scenarios. For job seekers, this guide should serve as a robust framework for your interview preparation, helping you build confidence and identify areas for review. For hiring managers and recruiters, this list provides a structured set of questions to effectively gauge a candidate’s SQL proficiency at all levels. Remember that knowing the definitions is only half the battle; the true measure of a SQL expert is the ability to combine these concepts to solve complex problems and write efficient, reliable, and readable code.