Functions: Molding the Data You Retrieve

ToChapter we talk about functions. Functions in SQL enable you to perform feats such as determining the sum of a column or converting all the characters of a string to uppercase. By the end of the day, you will understand and be able to use all the following:

- Aggregate functions
- Date and time functions
- Arithmetic functions
- Character functions
- Conversion functions
- Miscellaneous functions

These functions greatly increase your ability to manipulate the information you
retrieved using the basic functions of SQL that were described earlier this week.
The first five aggregate functions, `COUNT`, `SUM`, `AVG`, `MAX`,
and `MIN`, are defined in the ANSI standard. Most implementations of SQL have
extensions to these aggregate functions, some of which are covered today. Some implementations
may use different names for these functions.

These functions are also referred to as group functions. They return a value based
on the values in a column. (After all, you wouldn't ask for the average of a single
field.) The examples in this section use the table `TEAMSTATS`:

SQL>SELECT * FROM TEAMSTATS;

NAME POS AB HITS WALKS SINGLES DOUBLES TRIPLES HR SO --------- --- --- ---- ----- ------- ------- ------- -- -- JONES 1B 145 45 34 31 8 1 5 10 DONKNOW 3B 175 65 23 50 10 1 4 15 WORLEY LF 157 49 15 35 8 3 3 16 DAVID OF 187 70 24 48 4 0 17 42 HAMHOCKER 3B 50 12 10 10 2 0 0 13 CASEY DH 1 0 0 0 0 0 0 1 6 rows selected.

The function `COUNT` returns the number of rows that satisfy the condition
in the `WHERE` clause. Say you wanted to know how many ball players were hitting
under 350. You would type

SQL>SELECT COUNT(*)2FROM TEAMSTATS3WHERE HITS/AB < .35;COUNT(*) -------- 4

To make the code more readable, try an alias:

SQL>SELECT COUNT(*) NUM_BELOW_3502FROM TEAMSTATS3WHERE HITS/AB < .35;NUM_BELOW_350 ------------- 4

Would it make any difference if you tried a column name instead of the asterisk? (Notice the use of parentheses around the column names.) Try this:

SQL>SELECT COUNT(NAME) NUM_BELOW_3502FROM TEAMSTATS3WHERE HITS/AB < .35;NUM_BELOW_350 ------------- 4

The answer is no. The `NAME` column that you selected was not involved
in the `WHERE` statement. If you use `COUNT` without a `WHERE`
clause, it returns the number of records in the table.

SQL>SELECT COUNT(*)2FROM TEAMSTATS;COUNT(*) --------- 6

`SUM` does just that. It returns the sum of all values in a column. To
find out how many singles have been hit, type

SQL>SELECT SUM(SINGLES) TOTAL_SINGLES2FROM TEAMSTATS;

TOTAL_SINGLES ------------- 174

To get several sums, use

SQL>SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES, SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR2FROM TEAMSTATS;TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR ------------- ------------- ------------- -------- 174 32 5 29

To collect similar information on all 300 or better players, type

SQL>SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES, SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR2FROM TEAMSTATS3WHERE HITS/AB >= .300;TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR ------------- ------------- ------------- -------- 164 30 5 29

To compute a team batting average, type

SQL>SELECT SUM(HITS)/SUM(AB) TEAM_AVERAGE2FROM TEAMSTATS;TEAM_AVERAGE ------------ .33706294

`SUM` works only with numbers. If you try it on a nonnumerical field, you
get

SQL>SELECT SUM(NAME)2FROM TEAMSTATS;ERROR: ORA-01722: invalid number no rows selected

This error message is logical because you cannot sum a group of names.

The `AVG` function computes the average of a column. To find the average
number of strike outs, use this:

SQL>SELECT AVG(SO) AVE_STRIKE_OUTS2FROM TEAMSTATS;

AVE_STRIKE_OUTS --------------- 16.166667

The following example illustrates the difference between `SUM` and `AVG`:

SQL>SELECT AVG(HITS/AB) TEAM_AVERAGE2FROM TEAMSTATS;TEAM_AVERAGE ------------ .26803448

The team was batting over 300 in the previous example! What happened? `AVG`
computed the average of the combined column hits divided by at bats, whereas the
example with `SUM` divided the total number of hits by the number of at bats.
For example, player A gets 50 hits in 100 at bats for a .500 average. Player B gets
0 hits in 1 at bat for a 0.0 average. The average of 0.0 and 0.5 is .250. If you
compute the combined average of 50 hits in 101 at bats, the answer is a respectable
.495. The following statement returns the correct batting average:

SQL>SELECT AVG(HITS)/AVG(AB) TEAM_AVERAGE2FROM TEAMSTATS;

TEAM_AVERAGE ------------ .33706294

Like the `SUM` function, `AVG` works only with numbers.

If you want to find the largest value in a column, use `MAX`. For example,
what is the highest number of hits?

SQL>SELECT MAX(HITS)2FROM TEAMSTATS;

MAX(HITS) --------- 70

Can you find out who has the most hits?

SQL>SELECT NAME2FROM TEAMSTATS3WHERE HITS = MAX(HITS);ERROR at line 3: ORA-00934: group function is not allowed here

Unfortunately, you can't. The error message is a reminder that this group function
(remember that *aggregate functions* are also called *group functions*)
does not work in the `WHERE` clause. Don't despair, Chapter 7, "Subqueries:
The Embedded `SELECT` Statement," covers the concept of subqueries and
explains a way to find who has the `MAX` hits.

What happens if you try a nonnumerical column?

SQL>SELECT MAX(NAME)2FROM TEAMSTATS;

MAX(NAME) --------------- WORLEY

Here's something new. `MAX` returns the highest (closest to Z) string.
Finally, a function that works with both characters and numbers.

`MIN` does the expected thing and works like `MAX` except it returns
the lowest member of a column. To find out the fewest at bats, type

SQL>SELECT MIN(AB)2FROM TEAMSTATS;

MIN(AB) --------- 1

The following statement returns the name closest to the beginning of the alphabet:

SQL>SELECT MIN(NAME)2FROM TEAMSTATS;MIN(NAME) --------------- CASEY

You can combine `MIN` with `MAX` to give a range of values. For
example:

SQL>SELECT MIN(AB), MAX(AB)2FROM TEAMSTATS;MIN(AB) MAX(AB) -------- -------- 1 187

This sort of information can be useful when using statistical functions.

NOTE:As we mentioned in the introduction, the first five aggregate functions are described in the ANSI standard. The remaining aggregate functions have become de facto standards, present in all important implementations of SQL. We use the Oracle7 names for these functions. Other implementations may use different names.

`VARIANCE` produces the square of the standard deviation, a number vital
to many statistical calculations. It works like this:

SQL>SELECT VARIANCE(HITS)2FROM TEAMSTATS;

VARIANCE(HITS) -------------- 802.96667

If you try a string

SQL>SELECT VARIANCE(NAME)2FROM TEAMSTATS;ERROR: ORA-01722: invalid number no rows selected

you find that `VARIANCE` is another function that works exclusively with
numbers.

The final group function, `STDDEV`, finds the standard deviation of a column
of numbers, as demonstrated by this example:

SQL>SELECT STDDEV(HITS)2FROM TEAMSTATS;

STDDEV(HITS) ------------ 28.336666

It also returns an error when confronted by a string:

SQL>SELECT STDDEV(NAME)2FROM TEAMSTATS;ERROR: ORA-01722: invalid number no rows selected

These aggregate functions can also be used in various combinations:

SQL>SELECT COUNT(AB),2AVG(AB),3MIN(AB),4MAX(AB),5STDDEV(AB),6VARIANCE(AB),7SUM(AB)8FROM TEAMSTATS;COUNT(AB) AVG(AB) MIN(AB) MAX(AB) STDDEV(AB) VARIANCE(AB) SUM(AB) --------- ------- ------- ------- ---------- ------------ ------- 6 119.167 1 187 75.589 5712.97 715

The next time you hear a sportscaster use statistics to fill the time between plays, you will know that SQL is at work somewhere behind the scenes.

We live in a civilization governed by times and dates, and most major implementations
of SQL have functions to cope with these concepts. This section uses the table `PROJECT`
to demonstrate the time and date functions.

SQL>SELECT * FROM PROJECT;

TASK STARTDATE ENDDATE -------------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 TECH SURVEY 02-APR-95 01-MAY-95 USER MTGS 15-MAY-95 30-MAY-95 DESIGN WIDGET 01-JUN-95 30-JUN-95 CODE WIDGET 01-JUL-95 02-SEP-95 TESTING 03-SEP-95 17-JAN-96

6 rows selected.

NOTE:This table used the Date data type. Most implementations of SQL have a Date data type, but the exact syntax may vary.

This function adds a number of months to a specified date. For example, say something extraordinary happened, and the preceding project slipped to the right by two months. You could make a new schedule by typing

SQL>SELECT TASK,2STARTDATE,3ENDDATE ORIGINAL_END,4ADD_MONTHS(ENDDATE,2)5FROM PROJECT;

TASK STARTDATE ORIGINAL_ ADD_MONTH -------------- --------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 01-JUN-95 TECH SURVEY 02-APR-95 01-MAY-95 01-JUL-95 USER MTGS 15-MAY-95 30-MAY-95 30-JUL-95 DESIGN WIDGET 01-JUN-95 30-JUN-95 31-AUG-95 CODE WIDGET 01-JUL-95 02-SEP-95 02-NOV-95 TESTING 03-SEP-95 17-JAN-96 17-MAR-96 6 rows selected.

Not that a slip like this is possible, but it's nice to have a function that makes
it so easy. `ADD_MONTHS` also works outside the `SELECT` clause. Typing

SQL>SELECT TASK TASKS_SHORTER_THAN_ONE_MONTH2FROM PROJECT3WHERE ADD_MONTHS(STARTDATE,1) > ENDDATE;

produces the following result:

TASKS_SHORTER_THAN_ONE_MONTH ---------------------------- KICKOFF MTG TECH SURVEY USER MTGS DESIGN WIDGET

You will find that all the functions in this section work in more than one place.
However, `ADD MONTHS` does not work with other data types like character or
number without the help of functions `TO_CHAR` and `TO_DATE`, which
are discussed later today.

`LAST_DAY` returns the last Chapter of a specified month. It is for those of
us who haven't mastered the "Thirty days has September..." rhyme--or at
least those of us who have not yet taught it to our computers. If, for example, you
need to know what the last Chapter of the month is in the column `ENDDATE`, you
would type

SQL>SELECT ENDDATE, LAST_DAY(ENDDATE)2FROM PROJECT;

Here's the result:

ENDDATE LAST_DAY(ENDDATE) --------- ----------------- 01-APR-95 30-APR-95 01-MAY-95 31-MAY-95 30-MAY-95 31-MAY-95 30-JUN-95 30-JUN-95 02-SEP-95 30-SEP-95 17-JAN-96 31-JAN-96

6 rows selected.

How does `LAST DAY` handle leap years?

SQL>SELECT LAST_DAY('1-FEB-95') NON_LEAP,2LAST_DAY('1-FEB-96') LEAP3FROM PROJECT;NON_LEAP LEAP --------- --------- 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96

6 rows selected.

You got the right result, but why were so many rows returned? Because you didn't specify an existing column or any conditions, the SQL engine applied the date functions in the statement to each existing row. Let's get something less redundant by using the following:

SQL>SELECT DISTINCT LAST_DAY('1-FEB-95') NON_LEAP,2LAST_DAY('1-FEB-96') LEAP3FROM PROJECT;

This statement uses the word `DISTINCT` (see Chapter 2, "Introduction
to the Query: The `SELECT` Statement") to produce the singular result

NON_LEAP LEAP --------- --------- 28-FEB-95 29-FEB-96

Unlike me, this function knows which years are leap years. But before you trust your own or your company's financial future to this or any other function, check your implementation!

If you need to know how many months fall between month x and month y, use `MONTHS_BETWEEN`
like this:

SQL>SELECT TASK, STARTDATE, ENDDATE,MONTHS_BETWEEN(STARTDATE,ENDDATE) DURATION2FROM PROJECT;

TASK STARTDATE ENDDATE DURATION -------------- --------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 0 TECH SURVEY 02-APR-95 01-MAY-95 -.9677419 USER MTGS 15-MAY-95 30-MAY-95 -.483871 DESIGN WIDGET 01-JUN-95 30-JUN-95 -.9354839 CODE WIDGET 01-JUL-95 02-SEP-95 -2.032258 TESTING 03-SEP-95 17-JAN-96 -4.451613

6 rows selected.

Wait a minute--that doesn't look right. Try this:

SQL>SELECT TASK, STARTDATE, ENDDATE,2MONTHS_BETWEEN(ENDDATE,STARTDATE) DURATION3FROM PROJECT;TASK STARTDATE ENDDATE DURATION -------------- --------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 0 TECH SURVEY 02-APR-95 01-MAY-95 .96774194 USER MTGS 15-MAY-95 30-MAY-95 .48387097 DESIGN WIDGET 01-JUN-95 30-JUN-95 .93548387 CODE WIDGET 01-JUL-95 02-SEP-95 2.0322581 TESTING 03-SEP-95 17-JAN-96 4.4516129

6 rows selected.

That's better. You see that `MONTHS_BETWEEN` is sensitive to the way you
order the months. Negative months might not be bad. For example, you could use a
negative result to determine whether one date happened before another. For example,
the following statement shows all the tasks that started before May 19, 1995:

SQL>SELECT *2FROM PROJECT3WHERE MONTHS_BETWEEN('19 MAY 95', STARTDATE) > 0;

TASK STARTDATE ENDDATE -------------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 TECH SURVEY 02-APR-95 01-MAY-95 USER MTGS 15-MAY-95 30-MAY-95

If you need to adjust the time according to the time zone you are in, the `New_TIME`
function is for you. Here are the time zones you can use with this function:

Abbreviation |
Time Zone |

AST or ADT | Atlantic standard or daylight time |

BST or BDT | Bering standard or daylight time |

CST or CDT | Central standard or daylight time |

EST or EDT | Eastern standard or daylight time |

GMT | Greenwich mean time |

HST or HDT | Alaska-Hawaii standard or daylight time |

MST or MDT | Mountain standard or daylight time |

NST | Newfoundland standard time |

PST or PDT | Pacific standard or daylight time |

YST or YDT | Yukon standard or daylight time |

You can adjust your time like this:

SQL>SELECT ENDDATE EDT,2NEW_TIME(ENDDATE, 'EDT','PDT')3FROM PROJECT;

EDT NEW_TIME(ENDDATE ---------------- ---------------- 01-APR-95 1200AM 31-MAR-95 0900PM 01-MAY-95 1200AM 30-APR-95 0900PM 30-MAY-95 1200AM 29-MAY-95 0900PM 30-JUN-95 1200AM 29-JUN-95 0900PM 02-SEP-95 1200AM 01-SEP-95 0900PM 17-JAN-96 1200AM 16-JAN-96 0900PM

6 rows selected.

Like magic, all the times are in the new time zone and the dates are adjusted.

`NEXT_DAY` finds the name of the first Chapter of the week that is equal to
or later than another specified date. For example, to send a report on the Friday
following the first Chapter of each event, you would type

SQL>SELECT STARTDATE,2NEXT_DAY(STARTDATE, 'FRIDAY')3FROM PROJECT;

which would return

STARTDATE NEXT_DAY( --------- --------- 01-APR-95 07-APR-95 02-APR-95 07-APR-95 15-MAY-95 19-MAY-95 01-JUN-95 02-JUN-95 01-JUL-95 07-JUL-95 03-SEP-95 08-SEP-95

6 rows selected.

The output tells you the date of the first FriChapter that occurs after your `STARTDATE`.

`SYSDATE` returns the system time and date:

SQL>SELECT DISTINCT SYSDATE2FROM PROJECT;

SYSDATE ---------------- 18-JUN-95 1020PM

If you wanted to see where you stood toChapter in a certain project, you could type

SQL>SELECT *2FROM PROJECT3WHERE STARTDATE > SYSDATE;TASK STARTDATE ENDDATE -------------- --------- --------- CODE WIDGET 01-JUL-95 02-SEP-95 TESTING 03-SEP-95 17-JAN-96

Now you can see what parts of the project start after today.

Many of the uses you have for the data you retrieve involve mathematics. Most
implementations of SQL provide arithmetic functions similar to the functions covered
here. The examples in this section use the `NUMBERS` table:

SQL>SELECT *2FROM NUMBERS;

A B --------- --------- 3.1415 4 -45 .707 5 9 -57.667 42 15 55 -7.2 5.3 6 rows selected.

The `ABS` function returns the absolute value of the number you point to.
For example:

SQL>SELECT ABS(A) ABSOLUTE_VALUE2FROM NUMBERS;

ABSOLUTE_VALUE -------------- 3.1415 45 5 57.667 15 7.2 6 rows selected.

`ABS` changes all the negative numbers to positive and leaves positive
numbers alone.

`CEIL` returns the smallest integer greater than or equal to its argument.
`FLOOR` does just the reverse, returning the largest integer equal to or less
than its argument. For example:

SQL>SELECT B, CEIL(B) CEILING2FROM NUMBERS;

B CEILING --------- --------- 4 4 .707 1 9 9 42 42 55 55 5.3 6

6 rows selected.

And

SQL>SELECT A, FLOOR(A) FLOOR2FROM NUMBERS;A FLOOR --------- --------- 3.1415 3 -45 -45 5 5 -57.667 -58 15 15 -7.2 -8

6 rows selected.

The `COS`, `SIN`, and `TAN` functions provide support for
various trigonometric concepts. They all work on the assumption that n is in radians.
The following statement returns some unexpected values if you don't realize `COS`
expects `A` to be in radians.

SQL>SELECT A, COS(A)2FROM NUMBERS;

A COS(A) --------- --------- 3.1415 -1 -45 .52532199 5 .28366219 -57.667 .437183 15 -.7596879 -7.2 .60835131

You would expect the `COS` of `45` degrees to be in the neighborhood
of `.707`, not `.525`. To make this function work the way you would
expect it to in a degree-oriented world, you need to convert degrees to radians.
(When was the last time you heard a news broadcast report that a politician had done
a pi-radian turn? You hear about a 180-degree turn.) Because 360 degrees = 2 pi radians,
you can write

SQL>SELECT A, COS(A* 0.01745329251994)2FROM NUMBERS;

A COS(A*0.01745329251994) --------- ----------------------- 3.1415 .99849724 -45 .70710678 5 .9961947 -57.667 .5348391 15 .96592583 -7.2 .9921147

Note that the number `0.01745329251994` is radians divided by degrees.
The trigonometric functions work as follows:

SQL>SELECT A, COS(A*0.017453), COSH(A*0.017453)2FROM NUMBERS;

A COS(A*0.017453) COSH(A*0.017453) --------- --------------- ---------------- 3.1415 .99849729 1.0015035 -45 .70711609 1.3245977 5 .99619483 1.00381 -57.667 .53485335 1.5507072 15 .96592696 1.0344645 -7.2 .99211497 1.0079058

6 rows selected.

And

SQL>SELECT A, SIN(A*0.017453), SINH(A*0.017453)2FROM NUMBERS;

A SIN(A*0.017453) SINH(A*0.017453) --------- --------------- ---------------- 3.1415 .05480113 .05485607 -45 -.7070975 -.8686535 5 .08715429 .0873758 -57.667 -.8449449 -1.185197 15 .25881481 .26479569 -7.2 -.1253311 -.1259926

6 rows selected.

And

SQL>SELECT A, TAN(A*0.017453), TANH(A*0.017453)2FROM NUMBERS;A TAN(A*0.017453) TANH(A*0.017453) --------- --------------- ---------------- 3.1415 .05488361 .05477372 -45 -.9999737 -.6557867 5 .08748719 .08704416 -57.667 -1.579769 -.7642948 15 .26794449 .25597369 -7.2 -.1263272 -.1250043

6 rows selected.

`EXP` enables you to raise *e* (*e* is a mathematical constant
used in various formulas) to a power. Here's how `EXP` raises *e* by
the values in column A:

SQL>SELECT A, EXP(A)2FROM NUMBERS;

A EXP(A) --------- --------- 3.1415 23.138549 -45 2.863E-20 5 148.41316 -57.667 9.027E-26 15 3269017.4 -7.2 .00074659

6 rows selected.

These two functions center on logarithms. `LN` returns the natural logarithm
of its argument. For example:

SQL>SELECT A, LN(A)2FROM NUMBERS;

ERROR: ORA-01428: argument '-45' is out of range

Did we neglect to mention that the argument had to be positive? Write

SQL>SELECT A, LN(ABS(A))2FROM NUMBERS;A LN(ABS(A)) --------- ---------- 3.1415 1.1447004 -45 3.8066625 5 1.6094379 -57.667 4.0546851 15 2.7080502 -7.2 1.974081

6 rows selected.

Notice how you can embed the function `ABS` inside the `LN` call.
The other logarith-mic function, `LOG`, takes two arguments, returning the
logarithm of the first argument in the base of the second. The following query returns
the logarithms of column B in base 10.

SQL>SELECT B, LOG(B, 10)2FROM NUMBERS;B LOG(B,10) ----------- --------- 4 1.660964 .707 -6.640962 9 1.0479516 42 .61604832 55 .57459287 5.3 1.3806894

6 rows selected.

You have encountered `MOD` before. On Chapter 3, "Expressions, Conditions,
and Operators," you saw that the ANSI standard for the modulo operator `%`
is sometimes implemented as the function `MOD`. Here's a query that returns
a table showing the remainder of A divided by B:

SQL>SELECT A, B, MOD(A,B)2FROM NUMBERS;

A B MOD(A,B) --------- --------- --------- 3.1415 4 3.1415 -45 .707 -.459 5 9 5 -57.667 42 -15.667 15 55 15 -7.2 5.3 -1.9

6 rows selected.

To raise one number to the power of another, use `POWER`. In this function
the first argument is raised to the power of the second:

SQL>SELECT A, B, POWER(A,B)2FROM NUMBERS;

ERROR: ORA-01428: argument '-45' is out of range

At first glance you are likely to think that the first argument can't be negative.
But that impression can't be true, because a number like -4 can be raised to a power.
Therefore, if the first number in the `POWER` function is negative, the second
must be an integer. You can work around this problem by using `CEIL` (or `FLOOR`):

SQL>SELECT A, CEIL(B), POWER(A,CEIL(B))2FROM NUMBERS;

A CEIL(B) POWER(A,CEIL(B)) --------- --------- ---------------- 3.1415 4 97.3976 -45 1 -45 5 9 1953125 -57.667 42 9.098E+73 15 55 4.842E+64 -7.2 6 139314.07

6 rows selected.

That's better!

`SIGN` returns `-1` if its argument is less than `0`, `0`
if its argument is equal to `0`, and `1` if its argument is greater
than `0`, as shown in the following example:

SQL>SELECT A, SIGN(A)2FROM NUMBERS;

A SIGN(A) --------- --------- 3.1415 1 -45 -1 5 1 -57.667 -1 15 1 -7.2 -1 0 0

7 rows selected.

You could also use `SIGN` in a `SELECT WHERE` clause like this:

SQL>SELECT A2FROM NUMBERS3WHERE SIGN(A) = 1;

A --------- 3.1415 5 15

The function `SQRT` returns the square root of an argument. Because the
square root of a negative number is undefined, you cannot use `SQRT` on negative
numbers.

SQL>SELECT A, SQRT(A)2FROM NUMBERS;

ERROR: ORA-01428: argument '-45' is out of range

However, you can fix this limitation with `ABS`:

SQL>SELECT ABS(A), SQRT(ABS(A))2FROM NUMBERS;ABS(A) SQRT(ABS(A)) --------- ------------ 3.1415 1.7724277 45 6.7082039 5 2.236068 57.667 7.5938791 15 3.8729833 7.2 2.6832816 0 0

7 rows selected.

Many implementations of SQL provide functions to manipulate characters and strings
of characters. This section covers the most common character functions. The examples
in this section use the table `CHARACTERS`.

SQL>SELECT * FROM CHARACTERS;

LASTNAME FIRSTNAME M CODE --------------- --------------- - --------- PURVIS KELLY A 32 TAYLOR CHUCK J 67 CHRISTINE LAURA C 65 ADAMS FESTER M 87 COSTALES ARMANDO A 77 KONG MAJOR G 52

6 rows selected.

`CHR` returns the character equivalent of the number it uses as an argument.
The character it returns depends on the character set of the database. For this example
the database is set to ASCII. The column `CODE` includes numbers.

SQL>SELECT CODE, CHR(CODE)2FROM CHARACTERS;

CODE CH --------- -- 32 67 C 65 A 87 W 77 M 52 4

6 rows selected.

The space opposite the `32` shows that `32` is a space in the ASCII
character set.

You used the equivalent of this function on Chapter 3, when you learned about operators.
The `||` symbol splices two strings together, as does `CONCAT`. It
works like this:

SQL>SELECT CONCAT(FIRSTNAME, LASTNAME) "FIRST AND LAST NAMES"2FROM CHARACTERS;

FIRST AND LAST NAMES ------------------------ KELLY PURVIS CHUCK TAYLOR LAURA CHRISTINE FESTER ADAMS ARMANDO COSTALES MAJOR KONG 6 rows selected.

Quotation marks surround the multiple-word alias `FIRST AND LAST NAMES`.
Again, it is safest to check your implementation to see if it allows multiple-word
aliases.

Also notice that even though the table looks like two separate columns, what you
are seeing is one column. The first value you concatenated, `FIRSTNAME`, is
15 characters wide. This operation retained all the characters in the field.

`INITCAP` capitalizes the first letter of a word and makes all other characters
lowercase.

SQL>SELECT FIRSTNAME BEFORE, INITCAP(FIRSTNAME) AFTER2FROM CHARACTERS;

BEFORE AFTER -------------- ---------- KELLY Kelly CHUCK Chuck LAURA Laura FESTER Fester ARMANDO Armando MAJOR Major

6 rows selected.

As you might expect, `LOWER` changes all the characters to lowercase; `UPPER`
does just the reverse.

The following example starts by doing a little magic with the `UPDATE`
function (you learn more about this next week) to change one of the values to lowercase:

SQL>UPDATE CHARACTERS2SET FIRSTNAME = 'kelly'3WHERE FIRSTNAME = 'KELLY';

1 row updated.

SQL>SELECT FIRSTNAME2FROM CHARACTERS;

FIRSTNAME --------------- kelly CHUCK LAURA FESTER ARMANDO MAJOR

6 rows selected.

Then you write

SQL>SELECT FIRSTNAME, UPPER(FIRSTNAME), LOWER(FIRSTNAME)2FROM CHARACTERS;

FIRSTNAME UPPER(FIRSTNAME LOWER(FIRSTNAME --------------- --------------- --------------- kelly KELLY kelly CHUCK CHUCK chuck LAURA LAURA laura FESTER FESTER fester ARMANDO ARMANDO armando MAJOR MAJOR major

6 rows selected.

Now you see the desired behavior.

`LPAD` and `RPAD` take a minimum of two and a maximum of three arguments.
The first argument is the character string to be operated on. The second is the number
of characters to pad it with, and the optional third argument is the character to
pad it with. The third argument defaults to a blank, or it can be a single character
or a character string. The following statement adds five pad characters, assuming
that the field `LASTNAME` is defined as a 15-character field:

SQL>SELECT LASTNAME, LPAD(LASTNAME,20,'*')2FROM CHARACTERS;

LASTNAME LPAD(LASTNAME,20,'*' -------------- -------------------- PURVIS *****PURVIS TAYLOR *****TAYLOR CHRISTINE *****CHRISTINE ADAMS *****ADAMS COSTALES *****COSTALES KONG *****KONG

6 rows selected.

Why were only five pad characters added? Remember that the `LASTNAME` column
is 15 characters wide and that `LASTNAME` includes the blanks to the right
of the characters that make up the name. Some column data types eliminate padding
characters if the width of the column value is less than the total width allocated
for the column. Check your implementation. Now try the right side:

SQL>SELECT LASTNAME, RPAD(LASTNAME,20,'*')2FROM CHARACTERS;

LASTNAME RPAD(LASTNAME,20,'*' --------------- -------------------- PURVIS PURVIS ***** TAYLOR TAYLOR ***** CHRISTINE CHRISTINE ***** ADAMS ADAMS ***** COSTALES COSTALES ***** KONG KONG *****

6 rows selected.

Here you see that the blanks are considered part of the field name for these operations. The next two functions come in handy in this type of situation.

`LTRIM` and `RTRIM` take at least one and at most two arguments.
The first argument, like `LPAD` and `RPAD`, is a character string.
The optional second element is either a character or character string or defaults
to a blank. If you use a second argument that is not a blank, these trim functions
will trim that character the same way they trim the blanks in the following examples.

SQL>SELECT LASTNAME, RTRIM(LASTNAME)2FROM CHARACTERS;

LASTNAME RTRIM(LASTNAME) --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRISTINE ADAMS ADAMS COSTALES COSTALES KONG KONG

6 rows selected.

You can make sure that the characters have been trimmed with the following statement:

SQL>SELECT LASTNAME, RPAD(RTRIM(LASTNAME),20,'*')2FROM CHARACTERS;

LASTNAME RPAD(RTRIM(LASTNAME) --------------- -------------------- PURVIS PURVIS************** TAYLOR TAYLOR************** CHRISTINE CHRISTINE*********** ADAMS ADAMS*************** COSTALES COSTALES************ KONG KONG****************

6 rows selected.

The output proves that trim is working. Now try `LTRIM`:

SQL>SELECT LASTNAME, LTRIM(LASTNAME, 'C')2FROM CHARACTERS;

LASTNAME LTRIM(LASTNAME, --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE HRISTINE ADAMS ADAMS COSTALES OSTALES KONG KONG

6 rows selected.

Note the missing Cs in the third and fifth rows.

`REPLACE` does just that. Of its three arguments, the first is the string
to be searched. The second is the search key. The last is the optional replacement
string. If the third argument is left out or `NULL`, each occurrence of the
search key on the string to be searched is removed and is not replaced with anything.

SQL>SELECT LASTNAME, REPLACE(LASTNAME, 'ST') REPLACEMENT2FROM CHARACTERS;

LASTNAME REPLACEMENT --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRIINE ADAMS ADAMS COSTALES COALES KONG KONG

6 rows selected.

If you have a third argument, it is substituted for each occurrence of the search key in the target string. For example:

SQL>SELECT LASTNAME, REPLACE(LASTNAME, 'ST','**') REPLACEMENT2FROM CHARACTERS;

LASTNAME REPLACEMENT --------------- ------------ PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRI**INE ADAMS ADAMS COSTALES CO**ALES KONG KONG

6 rows selected.

If the second argument is `NULL`, the target string is returned with no
changes.

SQL>SELECT LASTNAME, REPLACE(LASTNAME, NULL) REPLACEMENT2FROM CHARACTERS;

LASTNAME REPLACEMENT --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRISTINE ADAMS ADAMS COSTALES COSTALES KONG KONG

6 rows selected.

This three-argument function enables you to take a piece out of a target string. The first argument is the target string. The second argument is the position of the first character to be output. The third argument is the number of characters to show.

SQL>SELECT FIRSTNAME, SUBSTR(FIRSTNAME,2,3)2FROM CHARACTERS;

FIRSTNAME SUB --------------- --- kelly ell CHUCK HUC LAURA AUR FESTER EST ARMANDO RMA MAJOR AJO

6 rows selected.

If you use a negative number as the second argument, the starting point is determined by counting backwards from the end, like this: