Wednesday, September 28, 2011

SQL Tutorial Basics For Beginners

SQL Basic Tutorial For Beginning Coders

To begin to understand SQL one must first define what it is. Structured Query Language (SQL) is the computer language that is used to communicate with databases as well as access and manage the information the database contains. Fully understanding the concept of SQL as well as the languages mechanics and structure will help you easily build web applications or create personalized applications using the information contained in databases. The SQL language is great for website work. Several database applications are available free to developers and make designing commands that fit your individual needs easy to implement. 



While this article is not meant to be a comprehensive guide to SQL it will give a broad overview of the types of things that can be accomplished using SQL. Although SQL is commonly recognized in two separate forms, Data Manipulation Language (DML) and Data Definition Language (DDL), this introduction will cover the DML, used primarily for updates and queries.

SQL is considered a set-based language. The benefits of using a set-based language such as SQL over a procedural language is that we are able to define exactly what data you want to modify. With a procedural language, the database processes determine how to collect the data and apply your queries. SQL is typically much shorter and easier to code. It allows us to do an update or summary with one line of syntax that would be several lines of procedural language. SQL is also faster for the processor because any looping is done on a database server operating on an extremely low level. 

Let's look at SELECT queries first as they are the most common SQL statement. A SELECT query basically tells SQL to go and "find" the data we wish to view. As with any computer language, it is important to be very specific as to what data is requested for retrieval and we have to tell SQL exactly how we want it to search the database records. It is possible to have SQL retrieve information based on a single value or on multiple values. If multiple values are chosen, then the columns have to be separated with a comma. However, if we wish to select "all columns", then we accomplish this by using an asterisk (*) as a wild card. A simple example of this would be:

SELECT columns FROM my_data;
or
SELECT * FROM my_data;

The first statement defines information to be retrieved, where using the wildcard will pull every detail recorded for each table in the database. For example, if our database contained a client base, using a wildcard would pull every detail from every client in the database. For large databases this could keep us busy for quite a long time! 

For larger databases using conditions is convenient way to filter the information and stops an overload of information. By using conditions we can reduce the amount of information retrieved and make it more specific to our search criteria. This effectively restricts the amount of information provided with the query and streamlines the data so that only specific information is retrieved. To accomplish this specific conditions are included in the query statement and separated by commas. These conditions should list criteria the data must meet in order to be recognized and retrieved by the SQL query. An example of a query with conditions would be: 

SELECT client_email FROM client;

This form tells the database to provide client emails only from the specific table labeled client. If, for example, we only wanted to pull email information for clients named "Smith" we can restrict the conditions further as:

SELECT client_name FROM client WHERE client_name > 'Smith';

Setting Conditions On SQL Queries

A condition specifies a value that is true, false, or unknown about any row or group within the database. Results are given based on logical operators AND, OR, NOT to the result of each specification. If the operators are not specified, the result defaults back to a specified predicate. Parentheses are not supported in any condition. 

SQL queries used on existing databases retrieve needed information from the database. As stated, using a wildcard will pull all of the information contained in the database table. If we want to restrict the data generated by our query, we have to set more restrictions on the query. Let's imagine that our database is named "products" and contains the columns name, color, size and price. Using a broad query such as:

SELECT * FROM products;

This query would return all of the information on all of our products listed in the database. If however, we wanted to pull only products that are colored blue, a further condition must be set and the query would be written as follows:

SELECT * FROM products WHERE color = 'blue';

This tells the database to only pull information on products that are blue. You can make your request even more specific by adding further conditions. Let's say a client is asking about blue t-shirts in size Large. Your query would be coded as:

SELECT * FROM products

WHERE color = 'blue'

AND size = 'large';

To pull information using conditions it is important to remember that all conditions must be met. If there were no blue t-shirts in listed in the inventory database in size "Large", no record would be retrieved at all. It is possible to make a condition optional by interchanging "and" with "or". Thus the query would be written as:

SELECT * FROM products

WHERE color = 'blue'

OR size = 'large';

A query written this way would pull all of the blue shirts in addition to all of the large shirts.

Using Multi-Table Queries

Another way of using the INSERT statement is by using a sub-query. Also known as an inner or nested query, a sub-query is typically added in the WHERE clause. It is used most commonly when we understand how to search for a value with a SELECT statement, but are not sure of the exact value itself. Sub-queries are a great additional way of retrieving data from multiple tables at once. They are used most commonly with comparison operators like =, <, >, >=, <= etc.

This syntax can be used when we need to insert multiple rows in one single statement.

INSERT INTO my_table1 (col1,col2,...col) SELECT col1,col2,...colZ FROM my_table2 WHERE conditions;

The only limitation here is the number of rows meeting the conditions criteria of the WHERE clause of the sub query. We can leave out the names of columns where values are given. These columns do need to be named in the sub query or a wildcard needs to be used. This syntax is used often when creating a new tablet with the columns desired by specifying a condition that cannot be met or is never true, such as "1=2." Doing this we cause the table to be created but it is not populated with information.

It is important to understand sub-queries before diving into multi-part questions. When sub queries are embedded in other SQL statements and allow answers to multi-part questions. As an example "give me the names of all tshirts in all categories that are priced $10 or more." This query can be answered by being broken down into two parts: 

1. "Which tshirts are priced over $10."
2. "Which tshirts are in the athletic category."

This query is coded into SQL as:

SELECT tshirt_vneck, type FROM product WHERE category = athletic;

SELECT tshirt_vneck FROM product WHERE price > 10.00

This type of query is written with the first question queried last, but it is read from the bottom up. In other words, the sub-query is given priority. It is possible to nest sub-queries to answer more and more detailed questions. It is rare however, to ever nest a query more than three levels deep due to performance issues. Also, it is harder for administrators to manage this type of nested sub-query because it becomes harder to understand the deeper they are nested.

Another option is to change the sub-query into a join. These are tricky however and also may cause performance issues. Many times these queries are completed simply by trying what works, and finding out what doesn't. To change a sub-query into a join we have to nest the conditions in the inner WHERE statement to the outer WHERE clause under the table FROM clause. We then have to specify the conditions of the join between the tables.

Here is another version of the first example reworked as a join:

SELECT tshirt_vneck FROM product WHERE category = athletic 

( SELECT tshirt_vneck FROM product WHERE price > 10 );

We can change this to:

SELECT tshirt_vneck FROM product, colors

WHERE budget > 10

AND tshirt_vneck.category = athletic.red;

Note that in the join condition we had to specify the table names because category is a column in both tables. If we didn't do this, Oracle would send an error message because it would not be able to understand which category we were referring to. A three part question simply has more tables added to the join, or another sub-query. 

Modifying Information In The Database

It becomes necessary at times to update or modify the information contained your database. This task is one of the most basic and common undertakings using SQL. Therefore, it is important that you understand SQL in order to perform these tasks properly and efficiently. Although an update is a modification of information in the database, queries are important because they are often used to insert and delete statements.

Let's think about your t-shirt database. You have just received a new load of shirts and your database needs to be updated showing new information for a line of athletic shirts. Also, a sleeveless version of a shirt that has been discontinued needs to be replaced by a newer version with a cap sleeve. To update, delete or insert information in a database, the queries would be written as:

UPDATE shirt_vneck SET col1 = val1, col2 = val2,... colz = valz WHERE type = athletic; 
DELETE FROM shirt_vneck WHERE type = sleeveless; 

The insert statement is a little more complicated but would be written as:

INSERT INTO shirt_vneck (name, color, size and price) 
VALUES (capsleeve, red, large, 10);

This statement would need to be repeated for each item being added into the inventory database. It is very important in the Insert statement to provide a value for each column in the table as Oracle will automatically assume all columns are being added. It is possible to only give values for one or two columns, but in that case we have to be specific as to which ones are receiving values. Also, if the columns are specified we can change the order. If not specified, they have to be in the exact same order as the original table.

If the WHERE syntax is not included in update and delete statements these statements will affect all of the table rows. Keep in mind that Oracle is not capable of letting us know if our command is valid. Let's assume we are raising the price on one specific t-shirt to 10% higher than it was. The syntax would be written as:

UPDATE shirt_vneck WHERE capsleeve SET price = price*1.1;

Rather than using a wildcard you will need to specify only one type of shirt. Otherwise the prices will be increased across the board.

To delete specific items from the table, you need to be very specific so that only what you wish to be deleted is removed. Hence, this statement would delete your entire inventory:

DELETE FROM products;

To delete only one type of shirt, all red shirts for example, from the database use a statement similar to this:

DELETE tshirt_vneck FROM employees WHERE color=red;

Again, if no rows match the conditions provided by the WHERE clause, no updates are made to the database.

The customization of SQL allows for manipulation of information and how it is retrieved from the database. Understanding the rules that govern SQL and being able to implement them will help you make your database perform at it's peak.

No comments:

Post a Comment