Saturday, February 23, 2008

Getting Started with BASH Linux/Unix

What is the Bash Shell?
The GNU Bourne-Again SHell (BASH) incorporates features from the C Shell (csh) and the Korn Shell (ksh) and conforms to the POSIX.2 shell specification. It provides a Command Line Interface (CLI) for working on *nix systems and is the most common shell used on Linux systems. Useful bash features will be the subject of the rest of this document.
Bash's Configuration Files
Because what I want to say here has already been written, I will quote the section entitled "Files used by Bash" from freeunix.dyndns.org's "Customizing your Bash environment":

In your home directory, 3 files have a special meaning to Bash, allowing you to set up your environment automatically when you log in and when you invoke another Bash shell, and allow you to execute commands when you log out. These files may exist in your home directory, but that depends largely on the Linux distro you're using and how your sysadmin (if not you) has set up your account. If they're missing, Bash defaults to /etc/profile. You can easily create these files yourself using your favorite text editor. They are:
.bash_profile : read and the commands in it executed by Bash every time you log in to the system
.bashrc : read and executed by Bash every time you start a subshell
.bash_logout : read and executed by Bash every time a login shell exits
Bash allows 2 synonyms for .bash_profile : .bash_login and .profile. These are derived from the C shell's file named .login and from the Bourne shell and Korn shell files named .profile. Only one of these files is read when you log in. If .bash_profile isn't there, Bash will look for .bash_login. If that is missing too, it will look for .profile.
.bash_profile is read and executed only when you start a login shell (that is, when you log in to the system). If you start a subshell (a new shell) by typing bash at the command prompt, it will read commands from .bashrc. This allows you to separate commands needed at login from those needed when invoking a subshell. However, most people want to have the same commands run regardless of whether it is a login shell or a subshell. This can be done by using the source command from within .bash_profile to execute .bashrc. You would then simply place all the commands in .bashrc.
These files are useful for automatically executing commands such as set, alias, and unalias, and for setting the PS1 through PS4 variables, all of which can be used to modify your bash environment.
Also use the source command to apply the changes that you have just made in a configuration file. For example, if you add an alias to /etc/profile, apply the change to your current session by executing:
$ source /etc/profile
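The quoted advice about running .bashrc from .bash_profile can be sketched as follows. This is only an illustration, not any distribution's shipped file: it builds a throwaway home directory with mktemp so your real dotfiles are never touched, and DEMO_LOADED is a made-up marker variable used purely to show that .bashrc was read.

```shell
# Throwaway home directory so the demo never touches real dotfiles.
demo_home=$(mktemp -d)

# .bashrc holds the settings wanted in every shell (DEMO_LOADED is hypothetical).
cat > "$demo_home/.bashrc" <<'EOF'
alias ll='ls -l'
DEMO_LOADED=yes
EOF

# .bash_profile hands over to .bashrc, but only if that file exists.
cat > "$demo_home/.bash_profile" <<'EOF'
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
EOF

# Imitate a login by sourcing .bash_profile with HOME pointing at the demo directory.
loaded=$(HOME="$demo_home" bash -c '. "$HOME/.bash_profile"; echo "$DEMO_LOADED"')
echo "DEMO_LOADED=$loaded"

rm -rf "$demo_home"
```

In a real home directory only the three-line if block would go into ~/.bash_profile; the rest is scaffolding for the demonstration.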
Modifying the Bash Shell with the set Command
Two options that can be set using the set command that will be of some interest to the common user are "-o vi" and "-o emacs". As with all of the environment modifying commands these can be typed at the command prompt or inserted into the appropriate file mentioned above.

Set Emacs Mode in Bash
$ set -o emacs
This is usually the default editing mode when in the bash environment and means that you are able to use commands like those in Emacs (defined in the Readline library) to move the cursor, cut and paste text, or undo editing.
Commands to take advantage of bash's Emacs Mode:
ctrl-a Move cursor to beginning of line
ctrl-e Move cursor to end of line
meta-b Move cursor back one word
meta-f Move cursor forward one word
ctrl-w Cut the last word
ctrl-u Cut everything before the cursor
ctrl-k Cut everything after the cursor
ctrl-y Paste the last thing to be cut
ctrl-_ Undo
NOTE: ctrl- = hold control, meta- = hold meta (where meta is usually the alt or escape key).
Combining ctrl-u with ctrl-y can be very helpful. If you are in the middle of typing a command and need to return to an empty prompt to look something up, use ctrl-u to cut what you have typed; once you have the information you need, ctrl-y will bring back what was cut.
Set Vi Mode in Bash
$ set -o vi
Vi mode allows for the use of vi-like commands at the bash prompt. When this mode is set you start out in insert mode (so, unlike when you first enter vi, you can type at the prompt straight away). Hitting the escape key takes you into command mode.
Commands to take advantage of bash's Vi Mode:
h Move cursor left
l Move cursor right
A Move cursor to end of line and put in insert mode
0 (zero) Move cursor to beginning of line (doesn't put in insert mode)
i Put into insert mode at current position
a Put into insert mode after current position
dd Delete line (saved for pasting)
D Delete text after current cursor position (saved for pasting)
p Paste text that was deleted
k Move up through history commands (previous command)
j Move down through history commands (next command)
u Undo
Useful Commands and Features
The commands in this section are non-mode specific, unlike the ones listed above.

Flip the Last Two Characters
If you type like me your fingers spit characters out in the wrong order on occasion. ctrl-t swaps the last two characters you typed.
Searching Bash History
As you enter commands at the CLI they are saved in the file ~/.bash_history. From the bash prompt you can browse from the most recently used commands back to the least recently used by pressing the up arrow. Pressing the down arrow does the opposite.
If you entered a command a long time ago and need to execute it again you can search for it. Press ctrl-r and type the text you want to search for.
Dealing with Spaces
First, I will mention a few ways to deal with spaces in directory names, file names, and everywhere else.

Using the Backslash Escape Sequence
One option is to use bash's escape character \. Any space following the backslash is treated as being part of the same string. These commands create a directory called "foo bar" and then remove it.
$ mkdir foo\ bar
$ rm -r foo\ bar
Backslash escape sequences are also interpreted inside certain strings, which can be very useful for scripting or for modifying the command prompt as discussed later.
Using Single/Double Quotes with Spaces and Variables
Single and double quotes can also be used for dealing with spaces.
$ touch 'dog poo'
$ rm "dog poo"
The difference between single and double quotes is that inside double quotes the $, \, and ` (backtick) characters keep their special meanings. Single quotes take the $ and \ literally and treat the next ' as the end of the string. Here's an example:
$ MY_VAR='This is my text'
$ echo $MY_VAR
This is my text
$ echo "$MY_VAR"
This is my text
$ echo '$MY_VAR'
$MY_VAR
The string following the $ character is interpreted as being a variable except when enclosed in single quotes as shown above.
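Quoting matters most when a variable holds a value containing spaces. The sketch below uses a hypothetical helper, count_args, that simply reports how many arguments it was given, to show how the same value is split or kept whole depending on the quoting.

```shell
# Helper (made up for this example) that reports how many arguments it received.
count_args() { echo "$#"; }

dir_name='foo bar'

unquoted=$(count_args $dir_name)     # the shell splits the value at the space
quoted=$(count_args "$dir_name")     # double quotes keep it as one argument
escaped=$(count_args foo\ bar)       # a backslash-escaped space does the same

echo "unquoted: $unquoted, quoted: $quoted, escaped: $escaped"
```

This is why commands such as rm "$dir_name" are usually written with the double quotes in place.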
Lists Using { and }
The characters { and } allow for list creation. Bash expands the list into separate words before the command runs, so a single command can act on every item in the list. This is perhaps best explained with examples:
$ touch {temp1,temp2,temp3,temp4}
This will create the files temp1, temp2, temp3, and temp4 (or update their timestamps if they already exist). When, as here, the file names share a common part, you can write that part once:
$ mv temp{1,2,3,4} ./foo\ bar/
This will move all four of the files into a directory 'foo bar'.
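To see what the braces actually turn into, it helps to echo them first. The following sketch repeats the steps above inside a scratch directory created with mktemp (an assumption made so that nothing in your real directories is touched) and assumes the script is run by bash, since plain sh may not expand braces.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
mkdir "$workdir/foo bar"

# Bash expands the braces before the command runs, so echo receives four words.
expanded=$(echo temp{1,2,3,4})
echo "$expanded"

# The same expansion feeds touch and mv four file names each.
touch "$workdir"/temp{1,2,3,4}
mv "$workdir"/temp{1,2,3,4} "$workdir/foo bar/"

# Count what ended up in the directory.
moved=$(ls "$workdir/foo bar" | wc -l | tr -d ' ')
echo "$moved files moved"

rm -rf "$workdir"
```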
Executing Multiple Commands in Sequence
This is a hefty title for a simple task. Consider that you want to run three commands, one right after the other, and you do not want to wait for each to finish before typing the next. You can type all three commands on a line and then start the process:
$ ./configure; make; make install
OR
$ ./configure && make && make install
With the first, if ./configure fails the other two commands will still execute. With the second, each command following && will only execute if the previous command finishes without error. Thus, the second would be most useful for this example because there is no reason to run 'make' or 'make install' if the configuration fails.
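The difference is easy to try without a real build. In this small sketch the false command stands in for a failing ./configure and echo stands in for make; both stand-ins are assumptions made for the demonstration.

```shell
# ';' runs the next command no matter how the previous one ended.
seq_out=$(false; echo "make ran")

# '&&' runs the next command only after success; '||' supplies a fallback on failure.
and_out=$(false && echo "make ran" || echo "stopped: configure failed")

echo "with ;  -> $seq_out"
echo "with && -> $and_out"
```

The || operator is the mirror image of &&: the command after it runs only when the one before it fails.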
Piping Output from One Command to Another
Piping allows the user to do several fantastic things by combining utilities. I will cover only very basic uses for piping. I most commonly use the pipe character, |, to pipe text that is output from one command through the grep command to search for text.
Examples:
See if a program, centericq, is running:
$ ps ax | grep centericq
25824 pts/2 S 0:18 centericq
Number the files in a directory (nl numbers each line, so the last number is the count):
$ ls | nl
1 #.emacs#
2 BitchX
3 Outcast double cd.lst
4 bm.shader
5 bmtexturesbase.pk3
If my memory serves, using RPM to check if a package is installed:
$ rpm -qa | grep package_name
A more advanced example:
$ cat /etc/passwd | awk -F: '{print $1 "\t" $6}' | sort > ./users
This sequence takes the information in the file passwd, pipes it to awk, which takes the first and sixth fields (the user name and home directory respectively), pipes these fields separated by a tab ("\t") to sort, which sorts the list alphabetically, and puts it into a file called users.
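Because /etc/passwd differs from machine to machine, the sketch below runs the awk and sort stages of the passwd example on three made-up records instead, so the result is predictable. It also uses wc -l, which prints a single total, as an alternative to nl when all you want is the count.

```shell
# Made-up passwd-style records so the result is the same on any machine.
sample='root:x:0:0:root:/root:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Pick fields 1 and 6 (user name and home directory), then sort alphabetically.
users=$(printf '%s\n' "$sample" | awk -F: '{print $1 "\t" $6}' | sort)
printf '%s\n' "$users"

# wc -l reports one total, whereas nl numbers every line.
total=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')
echo "$total accounts"
```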

Aliasing Commands
Once again I like how this topic is covered on freeunix.dyndns.org:8088 in "Customizing your Bash environment". I will quote the section entitled "Aliasses":
If you have used UNIX for a while, you will know that there are many commands available and that some of them have very cryptic names and/or can be invoked with a truckload of options and arguments. So, it would be nice to have a feature allowing you to rename these commands or type something simple instead of a list of options. Bash provides such a feature : the alias.
Aliases can be defined on the command line, in .bash_profile, or in .bashrc, using this form :
alias name=command
This means that name is an alias for command. Whenever name is typed as a command, Bash will substitute command in its place. Note that there are no spaces on either side of the equal sign. Quotes around command are necessary if the string being aliased consists of more than one word. A few examples :
alias ls='ls -aF --color=always'
alias ll='ls -l'
alias search=grep
alias mcd='mount /mnt/cdrom'
alias ucd='umount /mnt/cdrom'
alias mc='mc -c'
alias ..='cd ..'
alias ...='cd ../..'
The first example ensures that ls always uses color if available, that dotfiles are listed as well, that directories are marked with a / and executables with a *. To make ls do the same on FreeBSD, the alias would become :
alias ls='/bin/ls -aFG'
To see what aliases are currently active, simply type alias at the command prompt and all active aliases will be listed. To "disable" an alias type unalias followed by the alias name.
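The define, inspect, and remove cycle described in the quote can be tried in a few lines. One caveat for this sketch: Bash does not expand aliases inside scripts unless told to, so the example only defines the alias and queries it rather than running it.

```shell
# Define an alias, then ask Bash to print its definition back.
alias ll='ls -l'
definition=$(alias ll)
echo "$definition"

# Remove it again and confirm that it is gone.
unalias ll
if alias ll >/dev/null 2>&1; then
    state="still defined"
else
    state="removed"
fi
echo "ll is $state"
```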
Altering the Command Prompt Look and Information
Bash lets you change both the information shown in the command prompt and its colour. This is done by setting the PS1 variable. There is also a PS2 variable, which controls what is displayed when a command continues onto a second line and is usually '> ' by default. The PS1 variable is usually set to show some useful information by the Linux distribution you are running, but you may want to earn style points by doing your own modifications.
Here are the backslash-escape special characters that have meaning to bash:
\a an ASCII bell character (07)
\d the date in "Weekday Month Date" format
(e.g., "Tue May 26")
\e an ASCII escape character (033)
\h the hostname up to the first `.'
\H the hostname
\j the number of jobs currently managed by the shell
\l the basename of the shell's terminal device name
\n newline
\r carriage return
\s the name of the shell, the basename of $0
(the portion following the final slash)
\t the current time in 24-hour HH:MM:SS format
\T the current time in 12-hour HH:MM:SS format
\@ the current time in 12-hour am/pm format
\u the username of the current user
\v the version of bash (e.g., 2.00)
\V the release of bash, version + patchlevel
(e.g., 2.00.0)
\w the current working directory
\W the basename of the current working directory
\! the history number of this command
\# the command number of this command
\$ if the effective UID is 0, a #, otherwise a $
\nnn the character corresponding to the octal number nnn
\\ a backslash
\[ begin a sequence of non-printing characters,
which could be used to embed a terminal control
sequence into the prompt
\] end a sequence of non-printing characters

Colours In Bash:
Black 0;30 Dark Gray 1;30
Blue 0;34 Light Blue 1;34
Green 0;32 Light Green 1;32
Cyan 0;36 Light Cyan 1;36
Red 0;31 Light Red 1;31
Purple 0;35 Light Purple 1;35
Brown 0;33 Yellow 1;33
Light Gray 0;37 White 1;37
Here is an example borrowed from the Bash-Prompt-HOWTO:
PS1="\[\033[1;34m\][\$(date +%H%M)][\u@\h:\w]$\[\033[0m\] "
This turns the text blue, displays the time in brackets (very useful for not losing track of time while working), and displays the user name, host, and current directory enclosed in brackets. The "\[\033[0m\]" following the $ resets the colour to the terminal's default.
How about a command prompt modification that's a bit more "pretty":
PS1="\[\033[1;30m\][\[\033[1;34m\]\u\[\033[1;30m\]@\[\033[0;35m\]\h\[\033[1;30m\]] \[\033[0;37m\]\W \[\033[1;30m\]\$\[\033[0m\] "
This one sets up a prompt like this: [user@host] directory $
Break down:
\[\033[1;30m\] - Sets the colour for the characters that follow it. Here 1;30 will set them to Dark Gray.
\u \h \W \$ - Look to the table above.
\[\033[0m\] - Sets the colours back to how they were originally.
Each user on a system can have their own customized prompt by setting the PS1 variable in either the .bashrc or .profile files located in their home directories.
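Long prompt strings are easier to read and maintain when the colour codes are given names first. The sketch below is one possible arrangement, not the Bash-Prompt-HOWTO's own; the variable names dark_gray, light_blue, reset, and prompt are chosen just for this example.

```shell
# Colour codes from the table above, wrapped in \[ \] so Bash knows they take up no space.
dark_gray='\[\033[1;30m\]'
light_blue='\[\033[1;34m\]'
reset='\[\033[0m\]'

# Single quotes keep \u, \h, \W and \$ intact for Bash to interpret when it draws the prompt.
prompt="${dark_gray}["'\u@\h'"] ${light_blue}"'\W \$'"${reset} "

PS1=$prompt
printf '%s\n' "$PS1"
```

Keeping \$ inside single quotes matters: inside double quotes the shell would reduce it to a plain $, and the prompt would no longer switch to # for root.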

FUN STUFF!
A quick note about bashish. It allows for adding themes to a terminal running under a GUI. Check out the site for some screen-shots of what it can do.
Also, the program fortune is a must [at least I have considered it so ever since my Slackware days (it is default)]. It doesn't have anything to do with bash; it is a program that outputs a quote to the screen. Several add-ons are available to make it say stuff about programming, The X-Files, Futurama, Star Wars, and more. Just add a line in your /etc/profile like this to brighten your day when you log into your computer:
echo;fortune;echo
CDargs - Shell Bookmarks
Impress your friends and colleagues with lightning fast directory switching using the CDargs bookmarking tool. CDargs is not exclusive to BASH, but is a great addition and works on *nix based systems, including OS X. Download CDargs here in source or rpm.
CDargs allows for setting named marks in directories and moving to them quickly using the cdb command or an ncurses view.
Install
Compile / install source
Move cdargs-bash.sh to /etc
Add this line to your user's .bashrc file
source /etc/cdargs-bash.sh
Relogin or run source ~/.bashrc
Usage
mark
Mark a directory that you want to get to quickly in the future. Move to the desired directory and type mark <name>, or simply mark to have it take the name of the current directory. You can also mark a directory using the ncurses tool. Run cdargs or cdb to start the ncurses tool. Add a new mark by pressing a.
cdb
Now you have a bunch of marked directories. Simply type cdb <name> to move to the marked directory. Alternatively run cdb on its own and navigate with the arrow keys or numbers to the desired mark.
manage
Start the ncurses tool cdb. Some useful keys to thump:
a add new mark
d delete mark
e edit mark
right left arrows move in and out of directories
l list the files in the highlighted directory
c make a copy of a mark
enter go to selected directory / mark
You can also edit the ~/.cdargs text file directly to manage marks
Basic and Extended Bash Completion
Basic Bash Completion will work in any bash shell. It allows for completion of:
File Names
Directory Names
Executable Names
User Names (when they are prefixed with a ~)
Host Names (when they are prefixed with a @)
Variable Names (when they are prefixed with a $)
This is done simply by pressing the tab key after enough of the word you are trying to complete has been typed in. If when hitting tab the word is not completed there are probably multiple possibilities for the completion. Press tab again and it will list the possibilities. Sometimes on my machine I have to hit it a third time.
Extended Programmable Bash Completion is a program that you can install to complete much more than the names of the things listed above. With extended bash completion you can, for example, complete the name of a computer you are trying to connect to with ssh or scp. It achieves this by looking through the known_hosts file and using the hosts listed there for the completion. This is greatly customizable and the package and more information can be found here.
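The building blocks behind programmable completion are the bash builtins complete and compgen. As a minimal sketch, the lines below register a fixed word list for a hypothetical command called myservice (the command and its sub-commands are invented for this example) and then ask compgen which of those words match a prefix, which is what pressing tab would offer.

```shell
# Register a fixed word list as the completions for a hypothetical command.
complete -W "restart start status stop" myservice

# Print the completion spec that was just registered.
complete -p myservice

# compgen lists the candidates Bash would offer for the prefix "sta".
matches=$(compgen -W "restart start status stop" -- sta)
printf '%s\n' "$matches"
```

Putting the complete line in ~/.bashrc would make the completion available in every interactive shell.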
Configuration of Programmable Bash Completion is done in /etc/bash_completion. Here is a list of completions that are in my bash_completion file by default.
completes on signal names
completes on network interfaces
expands tildes in pathnames
completes on process IDs
completes on process group IDs
completes on user IDs
completes on group IDs
ifconfig(8) and iwconfig(8) helper function
bash alias completion
bash export completion
bash shell function completion
bash complete completion
service completion
chown(1) completion
chgrp(1) completion
umount(8) completion
mount(8) completion
Linux rmmod(8) completion
Linux insmod(8), modprobe(8) and modinfo(8) completion
man(1) completion
renice(8) completion
kill(1) completion
Linux and FreeBSD killall(1) completion
GNU find(1) completion
Linux ifconfig(8) completion
Linux iwconfig(8) completion
RedHat & Debian GNU/Linux if{up,down} completion
Linux ipsec(8) completion (for FreeS/WAN)
Postfix completion
cvs(1) completion
rpm completion
apt-get(8) completion
chsh(1) completion
chkconfig(8) completion
user@host completion
host completion based on ssh's known_hosts
ssh(1) completion
scp(1) completion
rsync(1) completion
Linux route(8) completion
GNU make(1) completion
GNU tar(1) completion
jar(1) completion
Linux iptables(8) completion
tcpdump(8) completion
autorpm(8) completion
ant(1) completion
mysqladmin(1) completion
gzip(1) completion
bzip2(1) completion
openssl(1) completion
screen(1) completion
lftp(1) bookmark completion
ncftp(1) bookmark completion
gdb(1) completion
Postgresql completion
psql(1) completion
createdb(1) completion
dropdb(1) completion
gcc(1) completion
Linux cardctl(8) completion
Debian dpkg(8) completion
Debian GNU dpkg-reconfigure(8) completion
Debian Linux dselect(8) completion
Java completion
PINE address-book completion
mutt completion
Debian reportbug(1) completion
Debian querybts(1) completion
update-alternatives completion
Python completion
Perl completion
rcs(1) completion
lilo(8) completion
links completion
FreeBSD package management tool completion
FreeBSD kernel module commands
FreeBSD portupgrade completion
FreeBSD portinstall completion
Slackware Linux removepkg completion
look(1) completion
ypcat(1) and ypmatch(1) completion
mplayer(1) completion
KDE dcop completion
wvdial(1) completion
gpg(1) completion
iconv(1) completion
dict(1) completion
cdrecord(1) completion
mkisofs(8) completion
mc(1) completion
yum(8) completion
yum-arch(8) completion
ImageMagick completion

Next Tutorial .htaccess

Introduction
In the last part I introduced you to .htaccess and some of its useful features. In this part I will show you how to use the .htaccess file to implement some of these.
Stop A Directory Index From Being Shown
Sometimes, for one reason or another, you will have no index file in your directory. This will, of course, mean that if someone types the directory name into their browser, a full listing of all the files in that directory will be shown. This could be a security risk for your site.

To prevent this (without creating lots of new 'index' files), you can enter a command into your .htaccess file to stop the directory list from being shown:
Options -Indexes
Deny/Allow Certain IP Addresses
In some situations, you may want to only allow people with specific IP addresses to access your site (for example, only allowing people using a particular ISP to get into a certain directory) or you may want to ban certain IP addresses (for example, keeping disruptive members out of your message boards). Of course, this will only work if you know the IP addresses you want to ban and, as most people on the internet now have a dynamic IP address, this is not always the best way to limit usage.
You can block an IP address by using:
deny from 000.000.000.000
where 000.000.000.000 is the IP address. If you only specify 1 or 2 of the groups of numbers, you will block a whole range.
You can allow an IP address by using:
allow from 000.000.000.000
where 000.000.000.000 is the IP address. If you only specify 1 or 2 of the groups of numbers, you will allow a whole range.
If you want to deny everyone from accessing a directory, you can use:
deny from all
but this will still allow scripts to use the files in the directory.
Alternative Index Files
You may not always want to use index.htm or index.html as your index file for a directory, for example if you are using PHP files in your site, you may want index.php to be the index file for a directory. You are not limited to 'index' files though. Using .htaccess you can set foofoo.blah to be your index file if you want to!
Alternate index files are entered in a list. The server will work from left to right, checking to see if each file exists. If none of them exist it will display a directory listing (unless, of course, you have turned this off).
DirectoryIndex index.php index.php3 messagebrd.pl index.html index.htm
Redirection
One of the most useful functions of the .htaccess file is to redirect requests to different files, either on the same server, or on a completely different web site. It can be extremely useful if you change the name of one of your files but still want users to be able to find it. Another use (which I find very useful) is to redirect to a longer URL, for example in my newsletters I can use a very short URL for my affiliate links. The following can be done to redirect a specific file:
Redirect /location/from/root/file.ext http://www.tutorial-on.com/new/file/location.xyz
In the above example, a file in the root directory called oldfile.html would be entered as:
/oldfile.html
and a file in the old subdirectory would be entered as:
/old/oldfile.html
You can also redirect whole directories of your site using the .htaccess file. For example, if you had a directory called olddirectory on your site and you had set up the same files on a new site at: http://www.newsite.com/newdirectory/ you could redirect all the files in that directory without having to specify each one:
Redirect /olddirectory http://www.tutorial-on.com/newdirectory
Then, any request to your site below /olddirectory will be redirected to the new site, with the extra information in the URL added on.
This can prove to be extremely powerful if used correctly.
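The way the rest of the path is carried over by a directory redirect can be modelled in a few lines of shell. This is only a rough imitation of the mapping for the Redirect /olddirectory line above, not how the web server implements it; the function name redirect_target and the example request paths are hypothetical.

```shell
# Rough model of "Redirect /olddirectory http://www.tutorial-on.com/newdirectory":
# requests at or below the old path keep the rest of their path on the new site.
redirect_target() {
    case "$1" in
        /olddirectory|/olddirectory/*)
            echo "http://www.tutorial-on.com/newdirectory${1#/olddirectory}" ;;
        *)
            echo "(not redirected)" ;;
    esac
}

redirect_target /olddirectory/images/logo.gif
redirect_target /elsewhere/page.html
```

The first call prints http://www.tutorial-on.com/newdirectory/images/logo.gif, while the second path lies outside /olddirectory and is left alone.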
Although there are many uses of the .htaccess file, by far the most popular, and probably most useful, is being able to reliably password protect directories on websites. Although JavaScript etc. can also be used to do this, only .htaccess enforces the check on the server (someone must know the password to get into the directory; there are no 'back doors' in the page source).
The .htaccess File
Adding password protection to a directory using .htaccess takes two stages. The first part is to add the appropriate lines to your .htaccess file in the directory you would like to protect. Everything below this directory will be password protected:
AuthName "Section Name"
AuthType Basic
AuthUserFile /full/path/to/.htpasswd
Require valid-user
There are a few parts of this which you will need to change for your site. You should replace "Section Name" with the name of the part of the site you are protecting e.g. "Members Area".
The /full/path/to/.htpasswd should be changed to reflect the full server path to the .htpasswd file (more on this later). If you do not know what the full path to your webspace is, contact your system administrator for details.
The .htpasswd File
Password protecting a directory takes a little more work than any of the other .htaccess functions because you must also create a file to contain the usernames and passwords which are allowed to access the site. These should be placed in a file which (by default) should be called .htpasswd. Like the .htaccess file, this is a file with no name and an 8 letter extension. This can be placed anywhere within your website (as the passwords are stored in encrypted form), but it is advisable to store it outside the web root so that it is impossible to access it from the web.
Entering Usernames And Passwords
Once you have created your .htpasswd file (you can do this in a standard text editor) you must enter the usernames and passwords to access the site. They should be entered as follows:
username:password
where the password is the encrypted format of the password. To encrypt the password you will either need to use one of the premade scripts available on the web or write your own. There is a good username/password service at the KxS site which will allow you to enter the user name and password and will output it in the correct format.
For multiple users, just add extra lines to your .htpasswd file in the same format as the first. There are even scripts available for free which will manage the .htpasswd file and will allow automatic adding/removing of users etc.
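If you have shell access, one way to produce a username:password line yourself is the openssl command-line tool, whose passwd -apr1 option generates the Apache-style MD5 hash format. The sketch below assumes openssl is installed; the user name alice, the password, and the scratch file are placeholders for illustration. Apache's own htpasswd utility, where available, does the same job and can write the file directly.

```shell
# Scratch file standing in for .htpasswd; the user name and password are examples only.
htpasswd_file=$(mktemp)

# openssl's -apr1 option produces the Apache-style MD5 hash format.
hash=$(openssl passwd -apr1 'example-password')

# Each line of the file is username:hash.
printf '%s:%s\n' "alice" "$hash" >> "$htpasswd_file"
entry=$(cat "$htpasswd_file")
echo "$entry"

rm -f "$htpasswd_file"
```

The hash differs on every run because a random salt is mixed in, so two users with the same password still get different lines.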
Accessing The Site
When you try to access a site which has been protected by .htaccess your browser will pop up a standard username/password dialog box. If you don't like this, there are certain scripts available which allow you to embed a username/password box in a website to do the authentication. You can also send the username and password (unencrypted) in the URL as follows:
http://username:password@www.tutorial-on.com/directory/
Summary
.htaccess is one of the most useful files a webmaster can use. There are a wide variety of different uses for it which can save time and increase security on your website.

Friday, February 22, 2008

.htaccess Tutorial

Introduction
In this tutorial you will find out about the .htaccess file and the power it has to improve your website. Although .htaccess is only a file, it can change settings on the servers and allow you to do many different things, the most popular being able to have your own custom 404 error pages. .htaccess isn't difficult to use and is really just made up of a few simple instructions in a text file.
Will My Host Support It?

This is probably the hardest question to give a simple answer to. Many hosts support .htaccess but don't actually publicise it and many other hosts have the capability but do not allow their users to have a .htaccess file. As a general rule, if your server runs Unix or Linux, or any version of the Apache web server it will support .htaccess, although your host may not allow you to use it.
A good sign of whether your host allows .htaccess files is if they support password protection of folders. To do this they will need to offer .htaccess (although in a few cases they will offer password protection but not let you use .htaccess). The best thing to do if you are unsure is to either upload your own .htaccess file and see if it works or e-mail your web host and ask them.
What Can I Do?
You may be wondering what .htaccess can do, or you may have read about some of its uses but don't realise how many things you can actually do with it.
There is a huge range of things .htaccess can do including: password protecting folders, redirecting users automatically, custom error pages, changing your file extensions, banning users with certain IP addresses, only allowing users with certain IP addresses, stopping directory listings and using a different file as the index file.
Creating A .htaccess File
Creating a .htaccess file may cause you a few problems. Writing the file is easy: you just need to enter the appropriate code into a text editor (like Notepad). You may run into problems with saving the file. Because .htaccess is a strange file name (the file actually has no name but an 8 letter file extension) it may not be accepted on certain systems (e.g. Windows 3.1). With most operating systems, though, all you need to do is to save the file by entering the name as:
".htaccess"
(including the quotes). If this doesn't work, you will need to name it something else (e.g. htaccess.txt) and then upload it to the server. Once you have uploaded the file you can then rename it using an FTP program.
Warning
Before you begin using .htaccess, I should give you one warning. Although using .htaccess on your server is extremely unlikely to cause you any problems (if something is wrong it simply won't work), you should be wary if you are using the Microsoft FrontPage Extensions. The FrontPage extensions use the .htaccess file so you should not really edit it to add your own information. If you do want to (this is not recommended, but possible) you should download the .htaccess file from your server first (if it exists) and then add your code to the beginning.
Custom Error Pages
The first use of the .htaccess file which I will cover is custom error pages. These will allow you to have your own, personal error pages (for example when a file is not found) instead of using your host's error pages or having no page. This will make your site seem much more professional in the unlikely event of an error. It will also allow you to create scripts to notify you if there is an error (for example I use a PHP script on Free Webmaster Help to automatically e-mail me when a page is not found).
You can use custom error pages for any error as long as you know its number (like 404 for page not found) by adding the following to your .htaccess file:
ErrorDocument errornumber /file.html
For example if I had the file notfound.html in the root directory of my site and I wanted to use it for a 404 error I would use:
ErrorDocument 404 /notfound.html
If the file is not in the root directory of your site, you just need to put the path to it:
ErrorDocument 500 /errorpages/500.html
These are some of the most common errors:
401 - Authorization Required
400 - Bad request
403 - Forbidden
500 - Internal Server Error
404 - Wrong page
Then, all you need to do is to create a file to display when the error happens and upload it and the .htaccess file.
Part 2
In part 2 I will show you how to use some of the other .htaccess functions to improve your website.

Thursday, February 21, 2008

PHP and Cookies

This section of the tutorial covers the use of the PHP scripting language to set and read cookies. Cookies in PHP are not difficult to implement, and there are only two commands that need to be used with them. PHP makes it easy to set and read cookies and provides all the features needed to give their details.
Setting a Basic Cookie
The PHP function for setting cookies is called:
setcookie()

It is a PHP function which can be called without using its return value (for example, you can simply execute a setcookie() command), or you can take the return value and use it. The setcookie() function returns a boolean (true or false) value depending on whether it is successful. So you could execute:
if (setcookie("CookieName", "value")) {
    echo "Cookie set";
} else {
    echo "Cookie not set";
}
For the purposes of this tutorial, though, we will not be using the return value, instead simply setting the cookie.
The most basic information for a cookie is its name and its value. The name of the cookie must be something you can refer to it by later. You don't need to worry about it clashing with other sites as cookie names are site specific but you should try and use a descriptive and unique name for your cookies.
For this first example, assume that you have used PHP to load the user's name into the variable $name and want to greet the user in the future by their name. You would need to create a cookie which stores their name as follows:
setcookie("UsersName",$name);
This creates the most basic of cookies, storing the user's name in a cookie called 'UsersName'. By setting cookies like this, you don't set any specific options, so by default the cookie will be available to the domain in which it was set (e.g. yoursite.com) and will be deleted when the user closes their browser.
Reading Cookie Values
PHP makes it extremely simple to read the value of a cookie. In PHP, form values are read using $_GET and $_POST. PHP has a similar superglobal array for cookies:
$_COOKIE['CookieName'];
This variable contains the value of the cookie with name 'CookieName'. So on your website, if you wanted to display the name of the user, you could simply use the following:
echo "Hello, ".$_COOKIE['UsersName']."! Welcome back!";
Of course, the user may not already have the cookie, so you should use the PHP function isset. This returns true if a variable has been set and false if not. Using this, your site could do the following:
if(isset($_COOKIE['UsersName'])){echo "Hello, ".$_COOKIE['UsersName']."! Welcome back!";}else{setcookie("UsersName",$name);}
Cookie Settings
Although the code I have given you allows you to set a simple cookie on the user's computer, it isn't very powerful because, for example, it is lost when the browser closes. One of the most powerful features of cookies is the ability to set an expiry date for the cookie. The cookie will remain on the user's computer until the expiry date, and will then be deleted automatically.
To set a cookie with an expiry date, use:
setcookie("UsersName", $name, time()+3600);
This code takes the current time (using time()), adds 3600 seconds to it, and uses the result as the expiry time for the cookie. The cookie will therefore remain on the user's computer for one hour. For one week (7 x 24 x 3600 = 604800 seconds) you would set the cookie as:
setcookie("UsersName", $name, time()+604800);
There are three other options which can be used when setting cookies. Firstly the path. This refers to where in the domain you are able to access the cookie in future. By default this is the current directory (so if you set the cookie at the page: www.mysite.com/scripts/setcookie.php, it would only be available to scripts in the scripts directory and below). You can set this to any part of your site, though, which can be useful in some situations.
A second setting you can change is the domain. By default, a cookie is only available in the domain you set it in. For example, if you set the cookie on http://www.mysite.com/ you can only ever access it from http://www.mysite.com/ (and not mail.mysite.com etc.). The most common need to change this setting is to allow the cookie to be viewed across all subdomains of a site. This can be done by setting the domain to .mysite.com (with both dots, including the leading one). By doing this, anything.mysite.com is accepted, not just http://www.mysite.com/.
Finally, a cookie has the option to be set as a secure cookie. If this is turned on, the cookie will only ever be surrendered to the site over a secure connection, not an insecure one.
The following code shows the implementation of a cookie with all settings specified:
setcookie("UsersName", $name, time()+3600, "/", ".mysite.com", 1);
The cookie set here is called 'UsersName' and again stores the value $name. It will expire an hour from the current time. It is available in all directories of the site (/ is the root directory). It is available across any subdomain of mysite.com, as '.mysite.com' has been given as the domain. The final 1 means that this is a secure cookie, which can only be transmitted over a secure connection. This would be 0 for a standard (non-secure) cookie.
Deleting Cookies
There are occasions on which you may wish to delete a cookie from a user's computer. This could be if, for example, you want to log the user out of a system (perhaps they are on a public computer). Deleting a cookie is quite simple to do because all you have to do is to set the expiry time in the past. By doing this, the cookie will be automatically deleted as soon as it is created, and will remove any data that already exists there. The simplest way is using:
setcookie("UsersName", "", time()-3600);
This sets the expiry time in the past so it should be deleted immediately. There is also no information stored in the cookie.
There is a known problem with this, though. Although it works in most cases, there can be problems if the clock or timezone on the user's computer is set wrongly. The safest way to completely delete a cookie is to use the following:
setcookie("UsersName", "", mktime(12,0,0,1, 1, 1990));
The mktime() function is a PHP function that builds a timestamp from the hour, minute, second, month, day and year you give it. The time specified here is in the year 1990, so even a badly configured computer should still delete the cookie immediately.
Conclusion
This short section of the tutorial should cover all the information you will need to set up, manage and delete cookies in PHP. Using other PHP scripting techniques you can store more data in a cookie (for example using it to interface with a database). All the information here, though, should allow you to do practically anything you need to with your cookie. If you want to learn about cookies with other programming languages, this same information is available in the other parts of the tutorial for them.

Cookies Tutorial

Introduction
Cookies are a technology which a Webmaster can use easily and simply to achieve a great many useful tasks when creating websites. Although cookies are well known to users, many people are not really sure what they are used for, and a large number of webmasters don't realise the possibilities open to them when they use cookies. Others have been put off, thinking that they must be difficult to use, but in reality cookies can be set and used by a simple command in most scripting languages. In this tutorial I'll cover setting and using cookies in PHP, JavaScript and ASP, as well as giving some basic information on how cookies can be used.

What Is A Cookie?
Apart from being a type of biscuit, a cookie is also a very useful piece of technology for use on the web. One of the problems which many websites need to overcome is that there is no way of directly finding out who is on a website. Although many details about the user (such as their browser, IP address and operating system) are available, the use of dynamic IP addresses (which change every time the user logs on) and IP address sharing (so that many people share the same IP) mean that there is no reliable way of recognising a particular user when they re-visit a website.
Cookies overcome this problem. They basically give the website owner the opportunity to store a little piece of information on a user's computer which they can then retrieve at a later date. Cookies are just tiny text files (only up to 4 KB in size) and a website can write them to the user's computer via the web browser. The same website can then request the cookie from the user and, if it exists, the value stored will be reported back to the website. The cookie can persist on the user's computer, staying there even if the browser is closed, the computer is switched off, or the internet connection is changed.
What Use Is A Cookie?
So why would anyone want to store 4000 characters of text on a user's computer? It isn't enough to put anything really worthwhile on there! The power of the cookie, though, is to recognise a site visitor over and over again. To give just a few uses of cookies:
Many portals and search engines use them to provide customized pages and results to their users, allowing such features as 'My Yahoo' etc.
Many websites use cookies to log their users in automatically. By storing a few pieces of user information they can automatically authenticate the user's details and use them to save the user time when they log in.
Visitor tracking and statistics systems often use them to track visitors. By assigning the visitor a cookie, they will not be counted more than once, so accurate unique visitor statistics can be obtained. Also, if a user has a unique cookie the system can 'follow' them through a website, showing the webmaster exactly where the visitor has been, and in what order.
Using Cookies
A cookie is a very basic data file. It has a name and a value and also stores the address of websites which are allowed to access it and an expiry time. Basically, a website will set a cookie and give it a name and value. This name is used by the website to refer to it, and no other website can access the cookie, even if they know its name. The name should be unique to the website, but it doesn't matter if it clashes with the name of a cookie from another website.
The cookie (as mentioned before) can only store up to 4000 characters of data. This is enough to store lots of information about a user so if, for example, you wanted to store the user preferences for a search engine (much like Google does), you could simply list the preferences in the cookie. If you wanted to store more data, you would have to store a unique ID in the cookie, which matched up with a database record, and you could then access the user's data this way.
To retrieve data, the website simply has to request if the user has a cookie with a particular name. If the user does, the value is returned to the script and it can be dealt with however the website owner chooses (for example a name stored in a cookie could be returned, a user ID could be loaded from a database, or a record could be made of a user visiting a site).
Every cookie is assigned an expiry date and time. It is up to the website owner to decide how long the cookie should exist for. Many owners may just choose to set the cookie for an hour, meaning it is only available for the user's single session. This is common in visitor tracking. Other cookies could be set for much longer: perhaps a week or a month (often used for affiliate program tracking) or even several years (often used for user preferences).
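As a language-neutral illustration of the parts described above (name, value, allowed domain and path, and lifetime), here is a minimal sketch using Python's standard http.cookies module. Python is used only for illustration; the helper names build_set_cookie and read_cookie, the value "Alice" and the domain ".mysite.com" are assumptions made up for this example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age=None, path=None, domain=None, secure=False):
    """Return the value of a Set-Cookie header for a single cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    if max_age is not None:
        morsel["max-age"] = max_age  # lifetime in seconds from now
    if path is not None:
        morsel["path"] = path        # which part of the site may read it
    if domain is not None:
        morsel["domain"] = domain    # which host names may read it
    if secure:
        morsel["secure"] = True      # only send over secure connections
    return morsel.OutputString()


def read_cookie(cookie_header, name):
    """Return the value stored under `name` in a Cookie request header, or None."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return jar[name].value if name in jar else None


# Hypothetical example values: a one-hour cookie readable on every subdomain.
print(build_set_cookie("UsersName", "Alice", max_age=3600, path="/", domain=".mysite.com"))
print(read_cookie("UsersName=Alice; theme=dark", "UsersName"))
```

The first helper produces what the website sends to the browser; the second shows the lookup by name that happens when the browser sends the cookie back.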
Cookie Security
Despite much worrying in the news a few years ago, cookies pose no real danger to users. Unless they are really worried about themselves being recognised by a website, they are harmless. The browser actually writes and reads cookies from the computer when requested to by a website, so a malicious website cannot damage the computer.
For webmasters, there are some security concerns. When the cookie is set, the domain(s) which can access it are set. Usually this is just the website that set the cookie. This makes them relatively secure, as you can be sure that your competitor cannot load your cookie from one of your visitors' computers (they cannot even find out if it exists).
One major security problem with cookies, though, is that they can easily be read by anyone using the computer. They are just a simple text file, so you should not under any circumstances store passwords in cookies. A common way to log people in automatically is to store an encrypted version of their password, which can then be matched with an encrypted version on the server. Another method is to store a unique ID and a unique validation number on the user's system. This is then referenced in a database to the user's account. This way, no actual details are stored and a malicious user cannot simply guess users' IDs (as there is the validation number).
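The second method (a unique ID plus a validation number) can be sketched roughly as follows. This is only an illustration in Python: the in-memory dictionary stands in for the database, and the function names and the "id:token" cookie layout are assumptions for this example, not a prescribed format.

```python
import hmac
import secrets

# Hypothetical stand-in for a database table mapping user ID -> validation token.
login_tokens = {}


def issue_login_cookie(user_id):
    """Create a random validation token, remember it, and return the cookie value."""
    token = secrets.token_hex(16)  # 32 hex characters, impractical to guess
    login_tokens[user_id] = token
    return f"{user_id}:{token}"


def check_login_cookie(cookie_value):
    """Return the user ID if the cookie matches the stored token, otherwise None."""
    user_id, sep, token = cookie_value.partition(":")
    expected = login_tokens.get(user_id)
    if not sep or expected is None:
        return None
    # compare_digest takes the same time wherever the two values first differ.
    if hmac.compare_digest(token.encode(), expected.encode()):
        return user_id
    return None
```

Because only the ID and a random token are stored on the user's computer, reading the cookie reveals no password, and guessing another user's ID is not enough without that user's token.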
This Tutorial
This introduction has covered some of the basics of cookies and how they are used. The next three sections cover the setting and reading of cookies using three of the most common scripting languages available. Each page is a self-contained description of how to set and read cookies for that language, so you should now jump ahead to the section for your chosen language.

Caching Tutorial

What’s a Web Cache? Why do people use them?
A Web cache sits between one or more Web servers (also known as origin servers) and a client or many clients, and watches requests come by, saving copies of the responses — like HTML pages, images and files (collectively known as representations) — for itself. Then, if there is another request for the same URL, it can use the response that it has, instead of asking the origin server for it again.

There are two main reasons that Web caches are used:
To reduce latency — Because the request is satisfied from the cache (which is closer to the client) instead of the origin server, it takes less time for it to get the representation and display it. This makes the Web seem more responsive.
To reduce network traffic — Because representations are reused, it reduces the amount of bandwidth used by a client. This saves money if the client is paying for traffic, and keeps their bandwidth requirements lower and more manageable.
Kinds of Web Caches
Browser Caches
If you examine the preferences dialog of any modern Web browser (like Internet Explorer, Safari or Mozilla), you’ll probably notice a “cache” setting. This lets you set aside a section of your computer’s hard disk to store representations that you’ve seen, just for you. The browser cache works according to fairly simple rules. It will check to make sure that the representations are fresh, usually once a session (that is, once in the current invocation of the browser).
This cache is especially useful when users hit the “back” button or click a link to see a page they’ve just looked at. Also, if you use the same navigation images throughout your site, they’ll be served from browsers’ caches almost instantaneously.
Proxy Caches
Web proxy caches work on the same principle, but a much larger scale. Proxies serve hundreds or thousands of users in the same way; large corporations and ISPs often set them up on their firewalls, or as standalone devices (also known as intermediaries).
Because proxy caches aren’t part of the client or the origin server, but instead are out on the network, requests have to be routed to them somehow. One way to do this is to use your browser’s proxy setting to manually tell it what proxy to use; another is using interception. Interception proxies have Web requests redirected to them by the underlying network itself, so that clients don’t need to be configured for them, or even know about them.
Proxy caches are a type of shared cache; rather than just having one person using them, they usually have a large number of users, and because of this they are very good at reducing latency and network traffic. That’s because popular representations are reused a number of times.
Gateway Caches
Also known as “reverse proxy caches” or “surrogate caches,” gateway caches are also intermediaries, but instead of being deployed by network administrators to save bandwidth, they’re typically deployed by Webmasters themselves, to make their sites more scalable, reliable and better performing.
Requests can be routed to gateway caches by a number of methods, but typically some form of load balancer is used to make one or more of them look like the origin server to clients.
Content delivery networks (CDNs) distribute gateway caches throughout the Internet (or a part of it) and sell caching to interested Web sites. Speedera and Akamai are examples of CDNs.
This tutorial focuses mostly on browser and proxy caches, although some of the information is suitable for those interested in gateway caches as well.
Aren’t Web Caches bad for me? Why should I help them?
Web caching is one of the most misunderstood technologies on the Internet. Webmasters in particular fear losing control of their site, because a proxy cache can “hide” their users from them, making it difficult to see who’s using the site.
Unfortunately for them, even if Web caches didn’t exist, there are too many variables on the Internet to assure that they’ll be able to get an accurate picture of how users see their site. If this is a big concern for you, this tutorial will teach you how to get the statistics you need without making your site cache-unfriendly.
Another concern is that caches can serve content that is out of date, or stale. However, this tutorial can show you how to configure your server to control how your content is cached.
On the other hand, if you plan your site well, caches can help your Web site load faster, and save load on your server and Internet link. The difference can be dramatic; a site that is difficult to cache may take several seconds to load, while one that takes advantage of caching can seem instantaneous in comparison. Users will appreciate a fast-loading site, and will visit more often.
Think of it this way; many large Internet companies are spending millions of dollars setting up farms of servers around the world to replicate their content, in order to make it as fast to access as possible for their users. Caches do the same for you, and they’re even closer to the end user. Best of all, you don’t have to pay for them.
The fact is that proxy and browser caches will be used whether you like it or not. If you don’t configure your site to be cached correctly, it will be cached using whatever defaults the cache’s administrator decides upon.
How Web Caches Work
All caches have a set of rules that they use to determine when to serve a representation from the cache, if it’s available. Some of these rules are set in the protocols (HTTP 1.0 and 1.1), and some are set by the administrator of the cache (either the user of the browser cache, or the proxy administrator).
Generally speaking, these are the most common rules that are followed (don’t worry if you don’t understand the details, it will be explained below):
If the response’s headers tell the cache not to keep it, it won’t.
If the request is authenticated or secure, it won’t be cached.
If no validator (an ETag or Last-Modified header) is present on a response, and it doesn't have any explicit freshness information, it will be considered uncacheable.
A cached representation is considered fresh (that is, able to be sent to a client without checking with the origin server) if:
It has an expiry time or other age-controlling header set, and is still within the fresh period.
A browser cache has already seen the representation, and has been set to check once a session.
A proxy cache has seen the representation recently, and it was modified relatively long ago.
Fresh representations are served directly from the cache, without checking with the origin server.
If a representation is stale, the origin server will be asked to validate it, or tell the cache whether the copy that it has is still good.
Together, freshness and validation are the most important ways that a cache works with content. A fresh representation will be available instantly from the cache, while a validated representation will avoid sending the entire representation over again if it hasn’t changed.
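The freshness rules above can be sketched as a small decision function. This is a simplified illustration in Python, not how any particular cache is implemented; the function names are made up for this example, and the 10% figure used for the "modified relatively long ago" rule is an assumed heuristic.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, heuristic_fraction=0.1):
    """Return how long a response may be served without checking the origin.

    `headers` is a dict of response headers that includes Date. Returns None
    when there is neither freshness information nor a Last-Modified time.
    """
    # Rule 1: an explicit max-age wins.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    date = parsedate_to_datetime(headers["Date"])
    # Rule 2: otherwise use Expires; an unparseable value counts as already expired.
    if "Expires" in headers:
        try:
            return parsedate_to_datetime(headers["Expires"]) - date
        except (TypeError, ValueError):
            return timedelta(0)
    # Rule 3 (assumed heuristic): a fraction of the time since the last change.
    if "Last-Modified" in headers:
        return (date - parsedate_to_datetime(headers["Last-Modified"])) * heuristic_fraction
    return None


def is_fresh(headers, now):
    """True if the stored response can be served at time `now` without validation."""
    lifetime = freshness_lifetime(headers)
    if lifetime is None:
        return False
    age = now - parsedate_to_datetime(headers["Date"])
    return age < lifetime
```

A response that is not fresh is not thrown away; it becomes a candidate for validation, which is described later in this tutorial.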
How (and how not) to Control Caches
There are several tools that Web designers and Webmasters can use to fine-tune how caches will treat their sites. It may require getting your hands a little dirty with your server’s configuration, but the results are worth it. For details on how to use these tools with your server, see the Implementation sections below.
HTML Meta Tags and HTTP Headers
HTML authors can put tags in a document’s <HEAD> section that describe its attributes. These meta tags are often used in the belief that they can mark a document as uncacheable, or expire it at a certain time.
Meta tags are easy to use, but aren’t very effective. That’s because they’re only honored by a few browser caches (which actually read the HTML), not proxy caches (which almost never read the HTML in the document). While it may be tempting to put a Pragma: no-cache meta tag into a Web page, it won’t necessarily cause it to be kept fresh.
On the other hand, true HTTP headers give you a lot of control over how both browser caches and proxies handle your representations. They can’t be seen in the HTML, and are usually automatically generated by the Web server. However, you can control them to some degree, depending on the server you use. In the following sections, you’ll see what HTTP headers are interesting, and how to apply them to your site.
HTTP headers are sent by the server before the HTML, and only seen by the browser and any intermediate caches. Typical HTTP 1.1 response headers might look like this:
HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
ETag: "3e86-410-3596fbbc"
Content-Length: 1040
Content-Type: text/html
The HTML would follow these headers, separated by a blank line. See the Implementation sections for information about how to set HTTP headers.
Pragma HTTP Headers (and why they don’t work)
Many people believe that assigning a Pragma: no-cache HTTP header to a representation will make it uncacheable. This is not necessarily true; the HTTP specification does not set any guidelines for Pragma response headers; instead, Pragma request headers (the headers that a browser sends to a server) are discussed. Although a few caches may honor this header, the majority won’t, and it won’t have any effect. Use the headers below instead.
Controlling Freshness with the Expires HTTP Header
The Expires HTTP header is a basic means of controlling caches; it tells all caches how long the associated representation is fresh for. After that time, caches will always check back with the origin server to see if a document is changed. Expires headers are supported by practically every cache.
Most Web servers allow you to set Expires response headers in a number of ways. Commonly, they will allow setting an absolute time to expire, a time based on the last time that the client saw the representation (last access time), or a time based on the last time the document changed on your server (last modification time).
Expires headers are especially good for making static images (like navigation bars and buttons) cacheable. Because they don’t change much, you can set extremely long expiry time on them, making your site appear much more responsive to your users. They’re also useful for controlling caching of a page that is regularly changed. For instance, if you update a news page once a day at 6am, you can set the representation to expire at that time, so caches will know when to get a fresh copy, without users having to hit ‘reload’.
The only value valid in an Expires header is an HTTP date; anything else will most likely be interpreted as ‘in the past’, so that the representation is uncacheable. Also, remember that the time in an HTTP date is Greenwich Mean Time (GMT), not local time.
For example:
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Although the Expires header is useful, it has some limitations. First, because there’s a date involved, the clocks on the Web server and the cache must be synchronised; if they have a different idea of the time, the intended results won’t be achieved, and caches might wrongly consider stale content as fresh.
Another problem with Expires is that it’s easy to forget that you’ve set some content to expire at a particular time. If you don’t update an Expires time before it passes, each and every request will go back to your Web server, increasing load and latency.
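One way to avoid a forgotten, hard-coded Expires time is to compute it on each response. The sketch below, in Python, generates a correctly formatted HTTP date for the "news page updated daily at 6am" example. The function name is hypothetical, and it assumes the 6am update time is expressed in GMT.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def next_daily_expiry(now, hour=6):
    """Return an Expires value for the next `hour`:00 GMT strictly after `now`."""
    now = now.astimezone(timezone.utc)
    expiry = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if expiry <= now:
        # Today's update time has already passed; use tomorrow's.
        expiry += timedelta(days=1)
    # usegmt=True yields the "GMT" suffix that HTTP dates require.
    return format_datetime(expiry, usegmt=True)


print("Expires:", next_daily_expiry(datetime.now(timezone.utc)))
```

Because the value is recalculated for every response, it never silently slips into the past the way a fixed date does.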
Cache-Control HTTP Headers
HTTP 1.1 introduced a new class of headers, Cache-Control response headers, to give Web publishers more control over their content, and to address the limitations of Expires.
Useful Cache-Control response headers include:
max-age=[seconds] — specifies the maximum amount of time that a representation will be considered fresh. Similar to Expires, except that this directive is relative to the time of the request, rather than absolute. [seconds] is the number of seconds from the time of the request you wish the representation to be fresh for.
s-maxage=[seconds] — similar to max-age, except that it only applies to shared (e.g., proxy) caches.
public — marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically uncacheable.
no-cache — forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid freshness, without sacrificing all of the benefits of caching.
no-store — instructs caches not to keep a copy of the representation under any conditions.
must-revalidate — tells caches that they must obey any freshness information you give them about a representation. HTTP allows caches to serve stale representations under special conditions; by specifying this header, you’re telling the cache that you want it to strictly follow your rules.
proxy-revalidate — similar to must-revalidate, except that it only applies to proxy caches.
For example:
Cache-Control: max-age=3600, must-revalidate
If you plan to use the Cache-Control headers, you should have a look at the excellent documentation in HTTP 1.1; see References and Further Information.
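To make the directive list above more concrete, here is a rough sketch in Python of how a cache might split a Cache-Control value into its directives. The function names are made up for this example, and the parser is deliberately simplified: it splits on commas and so does not handle quoted values that themselves contain commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of directives.

    Directives with an argument map to it (as int when numeric);
    bare directives map to True.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        name = name.strip().lower()
        if sep:
            arg = arg.strip().strip('"')
            directives[name] = int(arg) if arg.isdigit() else arg
        else:
            directives[name] = True
    return directives


def may_store(directives):
    """no-store forbids keeping any copy at all."""
    return "no-store" not in directives


def needs_validation_every_time(directives):
    """no-cache allows storing, but requires checking with the origin before reuse."""
    return "no-cache" in directives


print(parse_cache_control("max-age=3600, must-revalidate"))
```

The two small helpers underline the difference between no-store and no-cache, which are easy to confuse.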
Validators and Validation
In How Web Caches Work, we said that validation is used by servers and caches to communicate when a representation has changed. By using it, caches avoid having to download the entire representation when they already have a copy locally, but they’re not sure if it’s still fresh.
Validators are very important; if one isn’t present, and there isn’t any freshness information (Expires or Cache-Control) available, caches will not store a representation at all.
The most common validator is the time that the document last changed, as communicated in the Last-Modified header. When a cache has a representation stored that includes a Last-Modified header, it can use it to ask the server if the representation has changed since the last time it was seen, with an If-Modified-Since request.
HTTP 1.1 introduced a new kind of validator called the ETag. ETags are unique identifiers that are generated by the server and changed every time the representation does. Because the server controls how the ETag is generated, caches can be surer that if the ETag matches when they make an If-None-Match request, the representation really is the same.
Almost all caches use Last-Modified times in determining if a representation is fresh; ETag validation is also becoming prevalent.
Most modern Web servers will generate both ETag and Last-Modified headers to use as validators for static content (i.e., files) automatically; you won’t have to do anything. However, they don’t know enough about dynamic content (like CGI, ASP or database sites) to generate them; see Writing Cache-Aware Scripts.
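As an illustration of ETag validation, the sketch below (in Python) derives an ETag from the content and answers an If-None-Match request. Hashing the body is an assumption chosen for this example; real servers are free to build ETags in other ways, and the function names are hypothetical.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the response body (bytes)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            if tag.startswith("W/"):
                tag = tag[2:]  # weak and strong tags compare equal here
            candidates.append(tag)
        if "*" in candidates or etag in candidates:
            # The cache's copy is still good: send headers only.
            return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

A cache that stored the first response sends its ETag back in If-None-Match; if the content is unchanged it receives a short 304 response instead of the whole representation.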
Tips for Building a Cache-Aware Site
Besides using freshness information and validation, there are a number of other things you can do to make your site more cache-friendly.
Use URLs consistently — this is the golden rule of caching. If you serve the same content on different pages, to different users, or from different sites, it should use the same URL. This is the easiest and most effective way to make your site cache-friendly. For example, if you use “/index.html” in your HTML as a reference once, always use it that way.
Use a common library of images and other elements and refer back to them from different places.
Make caches store images and pages that don’t change often by using a Cache-Control: max-age header with a large value.
Make caches recognize regularly updated pages by specifying an appropriate max-age or expiration time.
If a resource (especially a downloadable file) changes, change its name. That way, you can make it expire far in the future, and still guarantee that the correct version is served; the page that links to it is the only one that will need a short expiry time.
Don’t change files unnecessarily. If you do, everything will have a falsely young Last-Modified date. For instance, when updating your site, don’t copy over the entire site; just move the files that you’ve changed.
Use cookies only where necessary — cookies are difficult to cache, and aren’t needed in most situations. If you must use a cookie, limit its use to dynamic pages.
Minimize use of SSL — because encrypted pages are not stored by shared caches, use them only when you have to, and use images on SSL pages sparingly.
Use the Cacheability Engine — it can help you apply many of the concepts in this tutorial.
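The tip above about changing a resource's name whenever it changes can be automated. A minimal sketch in Python, assuming you are happy to embed a short content hash in the file name (the naming scheme and function name are assumptions made for this example):

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content):
    """Return `path` with a short hash of `content` inserted before the extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


# Different content gives a different URL, so the old copy can stay cached
# for as long as you like without ever being served in place of the new one.
print(versioned_name("/downloads/tool.zip", b"release 1"))
print(versioned_name("/downloads/tool.zip", b"release 2"))
```

Only the page that links to the file needs a short expiry time, because it is the only place the current name appears.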
Writing Cache-Aware Scripts
By default, most scripts won’t return a validator (a Last-Modified or ETag response header) or freshness information (Expires or Cache-Control). While some scripts really are dynamic (meaning that they return a different response for every request), many (like search engines and database-driven sites) can benefit from being cache-friendly.
Generally speaking, if a script produces output that is reproducible with the same request at a later time (whether it be minutes or days later), it should be cacheable. If the content of the script changes only depending on what’s in the URL, it is cacheable; if the output depends on a cookie, authentication information or other external criteria, it probably isn’t.
The best way to make a script cache-friendly (as well as perform better) is to dump its content to a plain file whenever it changes. The Web server can then treat it like any other Web page, generating and using validators, which makes your life easier. Remember to only write files that have changed, so the Last-Modified times are preserved.
Another way to make a script cacheable in a limited fashion is to set an age-related header for as far in the future as practical. Although this can be done with Expires, it’s probably easiest to do so with Cache-Control: max-age, which will make the response fresh for an amount of time after the request.
If you can’t do that, you’ll need to make the script generate a validator, and then respond to If-Modified-Since and/or If-None-Match requests. This can be done by parsing the HTTP headers, and then responding with 304 Not Modified when appropriate. Unfortunately, this is not a trivial task.
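To give an idea of what responding to If-Modified-Since involves, here is a simplified sketch in Python. It assumes the script already knows when its underlying data last changed; the function name and the (status, headers, body) return shape are hypothetical, chosen only for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_changed, body, if_modified_since=None):
    """Return (status, headers, body) for content last changed at `last_changed`."""
    # HTTP dates have one-second resolution, so drop any fraction first.
    last_changed = last_changed.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_changed, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # ignore a date we cannot parse
        if since is not None and since.tzinfo is not None and last_changed <= since:
            # Nothing has changed since the cache's copy: headers only.
            return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

The same pattern applies to If-None-Match, comparing ETags instead of dates.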
Some other tips:
Don’t use POST unless it’s appropriate. Responses to the POST method aren’t kept by most caches; if you send information in the path or query (via GET), caches can store that information for the future.
Don’t embed user-specific information in the URL unless the content generated is completely unique to that user.
Don’t count on all requests from a user coming from the same host, because caches often work together.
Generate Content-Length response headers. It’s easy to do, and it will allow the response of your script to be used in a persistent connection. This allows clients to request multiple representations on one TCP/IP connection, instead of setting up a connection for every request. It makes your site seem much faster.
See the Implementation Notes for more specific information.
Frequently Asked Questions
What are the most important things to make cacheable?
A good strategy is to identify the most popular, largest representations (especially images) and work with them first.
How can I make my pages as fast as possible with caches?
The most cacheable representation is one with a long freshness time set. Validation does help reduce the time that it takes to see a representation, but the cache still has to contact the origin server to see if it’s fresh. If the cache already knows it’s fresh, it will be served directly.
I understand that caching is good, but I need to keep statistics on how many people visit my page!
If you must know every time a page is accessed, select ONE small item on a page (or the page itself), and make it uncacheable, by giving it suitable headers. For example, you could refer to a 1x1 transparent uncacheable image from each page. The Referer header will contain information about what page called it.
Be aware that even this will not give truly accurate statistics about your users, and is unfriendly to the Internet and your users; it generates unnecessary traffic, and forces people to wait for that uncached item to be downloaded. For more information about this, see On Interpreting Access Statistics in the references.
How can I see a representation’s HTTP headers?
Many Web browsers let you see the Expires and Last-Modified headers in a “page info” or similar interface. If available, this will give you a menu of the page and any representations (like images) associated with it, along with their details.
To see the full headers of a representation, you can manually connect to the Web server using a Telnet client.
To do so, you may need to type the port (by default, 80) into a separate field, or you may need to connect to www.example.com:80 or www.example.com 80 (note the space). Consult your Telnet client’s documentation.
Once you’ve opened a connection to the site, type a request for the representation. For instance, if you want to see the headers for http://www.example.com/foo.html, connect to www.example.com, port 80, and type:

GET /foo.html HTTP/1.1 [return]
Host: www.example.com [return][return]

Press the Return key every time you see [return]; make sure to press it twice at the end. This will print the headers, and then the full representation. To see the headers only, substitute HEAD for GET.
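The same manual request can be scripted. Below is a small Python sketch that sends a HEAD request over a plain socket and returns only the response headers. The function names are hypothetical, and the extra `Connection: close` header is an assumption added here so that the server closes the connection once it has answered; it is not part of the Telnet session described above.

```python
import socket


def build_request(method, path, host):
    """Build the raw request text that you would otherwise type by hand."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # assumption: ask the server to hang up afterwards
        "",                   # the blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_headers(host, path="/", port=80, timeout=10.0):
    """Send a HEAD request and return the raw status line and headers."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request("HEAD", path, host))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    raw = b"".join(chunks)
    # Everything before the first blank line is the header block.
    return raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
```

For example, `print(fetch_headers("www.example.com", "/foo.html"))` would show the headers for the representation used in the example above.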
My pages are password-protected; how do proxy caches deal with them?
By default, pages protected with HTTP authentication are considered private; they will not be kept by shared caches. However, you can make authenticated pages public with a Cache-Control: public header; HTTP 1.1-compliant caches will then allow them to be cached.
If you’d like such pages to be cacheable, but still authenticated for every user, combine the Cache-Control: public and no-cache headers. This tells the cache that it must submit the new client’s authentication information to the origin server before releasing the representation from the cache. This would look like:

Cache-Control: public, no-cache

Whether or not this is done, it’s best to minimize use of authentication; for example, if your images are not sensitive, put them in a separate directory and configure your server not to force authentication for it. That way, those images will be naturally cacheable.
Should I worry about security if people access my site through a cache?
SSL pages are not cached (or decrypted) by proxy caches, so you don’t have to worry about that. However, because caches store non-SSL requests and URLs fetched through them, you should be conscious about unsecured sites; an unscrupulous administrator could conceivably gather information about their users, especially in the URL.
In fact, any administrator on the network between your server and your clients could gather this type of information. One particular problem is when CGI scripts put usernames and passwords in the URL itself; this makes it trivial for others to find and use their login.
If you’re aware of the issues surrounding Web security in general, you shouldn’t have any surprises from proxy caches.
I’m looking for an integrated Web publishing solution. Which ones are cache-aware?
It varies. Generally speaking, the more complex a solution is, the more difficult it is to cache. The worst are ones which dynamically generate all content and don’t provide validators; they may not be cacheable at all. Speak with your vendor’s technical staff for more information, and see the Implementation notes below.
My images expire a month from now, but I need to change them in the caches now!
The Expires header can’t be circumvented; unless the cache (either browser or proxy) runs out of room and has to delete the representations, the cached copy will be used until then.
The most effective solution is to change any links to them; that way, completely new representations will be loaded fresh from the origin server. Remember that the page that refers to a representation will be cached as well. Because of this, it’s best to make static images and similar representations very cacheable, while keeping the HTML pages that refer to them on a tight leash.
If you want to reload a representation from a specific cache, you can either force a reload while using the cache (in Firefox, holding down shift while pressing ‘reload’ will do this by issuing a Pragma: no-cache request header), or have the cache administrator delete the representation through their interface.
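One common way to "change the links" without inventing names by hand is to derive part of the file name from the file's content. The Python sketch below shows the idea; the helper name `versioned_url` and the choice of an 8-character SHA-256 prefix are assumptions for illustration, not something this guide prescribes.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_url(url_path, content):
    """Return url_path with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(url_path)
    # /images/logo.gif becomes /images/logo.<digest>.gif
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

When the image changes, its hash changes, so the HTML page links to a new URL and caches fetch it fresh, while the old URL can keep its month-long expiry. The file must of course also be published under the new name.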
I run a Web Hosting service. How can I let my users publish cache-friendly pages?
If you’re using Apache, consider allowing them to use .htaccess files and providing appropriate documentation.
Otherwise, you can establish predetermined areas for various caching attributes in each virtual server. For instance, you could specify a directory /cache-1m that will be cached for one month after access, and a /no-cache area that will be served with headers instructing caches not to store representations from it.
Whatever you are able to do, it is best to work with your largest customers first on caching. Most of the savings (in bandwidth and in load on your servers) will be realized from high-volume sites.
I’ve marked my pages as cacheable, but my browser keeps requesting them on every request. How do I force the cache to keep representations of them?
Caches aren’t required to keep a representation and reuse it; they’re only required to not keep or use them under some conditions. All caches make decisions about which representations to keep based upon their size, type (e.g., image vs. html), or by how much space they have left to keep local copies. Yours may not be considered worth keeping around, compared to more popular or larger representations.
Some caches do allow their administrators to prioritize what kinds of representations are kept, and some allow representations to be “pinned” in cache, so that they’re always available.
Implementation Notes — Web Servers
Generally speaking, it’s best to use the latest version of whatever Web server you’ve chosen to deploy. Not only will they likely contain more cache-friendly features, new versions also usually have important security and performance improvements.
Apache HTTP Server
Apache uses optional modules to include headers, including both Expires and Cache-Control. Both modules are available in the 1.2 or greater distribution.
The modules need to be built into Apache; although they are included in the distribution, they are not turned on by default. To find out if the modules are enabled in your server, find the httpd binary and run httpd -l; this should print a list of the available modules. The modules we’re looking for are mod_expires and mod_headers.
If they aren’t available, and you have administrative access, you can recompile Apache to include them. This can be done either by uncommenting the appropriate lines in the Configuration file, or by using the --enable-module=expires and --enable-module=headers arguments to configure (1.3 or greater). Consult the INSTALL file found with the Apache distribution.
Once you have an Apache with the appropriate modules, you can use mod_expires to specify when representations should expire, either in .htaccess files or in the server’s access.conf file. You can specify expiry from either access or modification time, and apply it to a file type or as a default. See the module documentation for more information, and speak with your local Apache guru if you have trouble.
To apply Cache-Control headers, you’ll need to use the mod_headers module, which allows you to specify arbitrary HTTP headers for a resource. See the mod_headers documentation.
Here’s an example .htaccess file that demonstrates the use of some headers.
.htaccess files allow web publishers to use commands normally only found in configuration files. They affect the content of the directory they’re in and their subdirectories. Talk to your server administrator to find out if they’re enabled.

### activate mod_expires
ExpiresActive On
### Expire .gif's 1 month from when they're accessed
ExpiresByType image/gif A2592000
### Expire everything else 1 day from when it's last modified
### (this uses the Alternative syntax)
ExpiresDefault "modification plus 1 day"
### Apply a Cache-Control header to index.html
<Files index.html>
Header append Cache-Control "public, must-revalidate"
</Files>

Note that mod_expires automatically calculates and inserts a Cache-Control:max-age header as appropriate.
Apache 2.0’s configuration is very similar to that of 1.3; see the 2.0 mod_expires and mod_headers documentation for more information.
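The A2592000 value in the example is simply a base letter followed by a number of seconds: A counts from the time of access, M from the time of last modification. The Python sketch below works the arithmetic out; `expires_code` is a hypothetical helper written for this note, not part of Apache.

```python
def expires_code(base, days=0, hours=0, minutes=0, seconds=0):
    """Build a mod_expires interval code such as 'A2592000'.

    base is 'A' (counted from access time) or 'M' (from modification time).
    """
    if base not in ("A", "M"):
        raise ValueError("base must be 'A' or 'M'")
    total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return f"{base}{total}"
```

Here `expires_code("A", days=30)` gives `A2592000` (30 days of 86,400 seconds each), and `expires_code("M", days=1)` gives `M86400`, the compact form of "modification plus 1 day".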
Microsoft IIS
Microsoft’s Internet Information Server makes it very easy to set headers in a somewhat flexible way. Note that this is only possible in version 4 of the server, which will run only on NT Server.
To specify headers for an area of a site, select it in the Administration Tools interface, and bring up its properties. After selecting the HTTP Headers tab, you should see two interesting areas; Enable Content Expiration and Custom HTTP headers. The first should be self-explanatory, and the second can be used to apply Cache-Control headers.
See the ASP section below for information about setting headers in Active Server Pages. It is also possible to set headers from ISAPI modules; refer to MSDN for details.
Netscape/iPlanet Enterprise Server

Wednesday, February 20, 2008

Bank ATM Security

ATM bank cash machines have been incorporated in our way of life. They offer a real convenience to those on the run, but at the same time offer an element of risk. Using a bank ATM machine safely requires awareness and a little planning. Just because a bank ATM machine is open and available 24-hours a day doesn't mean it is always safe to use it.

ATM Robbery Facts
Most bank ATM robberies occur at night between 7pm and midnight when the machine only produces 10% of the daily transactions. Between 7pm and 4am, the ATMs handle only 11% of the total daily transactions but suffer 60% of the crime.
Who Are the Robbers?
Bank ATM robbers are usually males under 25 years of age and most work alone. ATM robbers usually position themselves nearby (50 feet) waiting for a victim to approach and withdraw cash. Half of the ATM robberies occur after the cash withdrawal. Many ATM robbery victims are women and were alone when robbed. Most claim that they never saw the robber coming. Most ATM robbers used a gun or claimed to have a concealed weapon when confronting the victim and demanding their cash.
Pick a Safe Location
Use only bank ATM machines in well-lighted, high-traffic areas. ATMs inside busy supermarkets are considered safer. Don't use ATM machines that are remote or hidden such as being located behind buildings, behind pillars, walls, or away from public view. Beware of obvious hiding places like shrubbery or overgrown trees. ATM robbers like to have the element of surprise and no witnesses. Robbers like good escape routes like nearby freeway on-ramps or high speed thoroughfares.
Get a list of ATM locations from your bank and keep it in your car. Choose an ATM that looks and 'feels' safer, even if it is a couple of miles out of the way. Try and limit your use to daylight hours. Take someone with you after hours, if you can. When you drive up to an ATM location, scan the area for any suspicious persons. If you see anyone suspicious that is standing nearby or sitting alone in a car, drive away. When you approach an ATM on foot be prepared and have your ATM card ready. Memorize your personal PIN number to prevent loss and speed the transaction. After inserting your card and your PIN number keep an eye out behind you (the robbers always come from behind or the side). Never accept an offer to help or request for help from a suspicious male at the machine.
Be Alert
If anyone suspicious or seemingly dangerous approaches, terminate your transaction and leave immediately, even if it means running away and leaving your ATM card in the machine. First, tell the suspicious male in a loud, firm voice to "back-off" and leave you alone. This is designed to startle the person and give you time to flee, if appropriate. It is far easier to apologize later or suffer a little embarrassment for your fear than to become a robbery victim. When you receive cash from the machine put it away immediately, extract your card, and walk away.
If you use your car at a bank drive-thru ATM machine the same rules apply. Make sure there are no obvious hiding places or suspicious persons loitering in the area. If there are, listen to your gut instinct and drive away. Keep your doors locked and the car in gear, with your foot firmly on the brake, while using the ATM machine. Keep a close eye on your rear and side view mirrors during the transaction. Robbers almost always approach from the rear on the driver's side. If you see anyone approaching, drive off even if it means leaving your ATM card behind. If you are confronted by an armed robber, just give up your money without argument. The cash is not worth serious injury or death. Get to a safe place and call the police immediately.
If you or your family members use ATM cash machines on a regular basis, here are some tips that can make the process a little safer:
Only use ATM machines in a well-lighted, open, high-traffic area
Use ATMs inside busy supermarkets when possible
If lights around the ATM are not working, don't use that machine
Avoid bank ATM machines adjacent to obvious hiding places
When you approach an ATM, scan the area first for loiterers
Have your card ready and leave quickly, not counting your cash in public
Walk, run, or drive away immediately if your instincts tell you so
Beware of offers for help from strangers during an ATM transaction
Tell any suspicious male in a loud, firm voice to back-off
Don't argue with a robber, if confronted, and give up the cash
Don't fight with or attempt to follow the robber
Drive or walk to a safe place and immediately call the police
by Chris E McGoey, CPP, CSP, CAM

How to Make a User Defined Tag

Like most things in CMS, adding a new plug-in is simple, although it's not quite like a holiday. To add your own plugin follow these steps...
1. The plugin editor is in the back-end so you need to login as admin, or a user with appropriate permissions.
2. In the admin panel click on 'Plugin Management' on the menu bar on the left.
3. At the bottom of the page click 'Add User Defined Tag'.
4. In the 'Name' text-box type the name of the tag. This is what you'll have to type in curly braces to add a tag into a page, so be descriptive but don't make it long.
5. In the 'Code' text-box type the php code that the tag will be replaced with when the page is requested. (Check the next section for more info)
6. Click 'Submit'.

Your First UDT

Instead of me rambling on about every last detail of plugins and PHP, a simple example should help you get started. Following in the footsteps of all introductions to programming, we'll try a hello world script first.

Follow the steps above to create a new plug-in, 'Name' it "helloworld" (no quotes) and in the 'Code' box type/paste this code...

echo "Hello World!";

Click 'Submit'. To test the tag, create a new 'Content Page' (see the Content Details guide in the Handbook), type "{helloworld}" (no quotes) somewhere in the body and then click 'Preview'. Instead of seeing {helloworld} you should see "Hello World!".

The echo command just writes what's in the quotes, so we can also use this to add HTML, or even DHTML or Javascript objects. To try it out, edit the plugin we made by clicking on the edit icon next to it on the 'Plugin Management' page. Try this code...

echo "

Hello World!

";

Test it in the same way as with the last one; you should now see something that looks a bit like this:

Hello World!

"But you can do all that with a html blob!" I hear you cry. Well, plugins get really useful when you start to add parameters. Parameters allow you to specify something in the tag. Let's say that we wanted to say hello to someone called Bob. Try this code...

echo "

Hello ".$params['name']."!

";

This has added a parameter, "name", to the plugin; the contents of that parameter will be put there when the plugin is called from a tag in a page. Test it in the same way as before, but instead of using "{helloworld}" use "{helloworld name='Bob'}" as the tag to define the name parameter. You should see something like this...

Hello Bob!

That just about covers writing basic plugins. You can do some useful things with the echo command, parameters and a little imagination, so just think what you can do if you learn a little PHP...

My addition: You can access the page content in a user defined tag by passing it as a parameter.

In your template:

{content assign=pagecontent}
{table_of_contents thepagecontent="$pagecontent"}

In your user defined tag named "table_of_contents":

echo $params['thepagecontent']; // Display page content.

I use this so I can parse the page content to automatically create a table of contents.

Tuesday, February 19, 2008

Introduction to XML Schema

XML Schema is an XML-based alternative to DTD.
An XML schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
What You Should Already Know

Before you continue you should have a basic understanding of the following:
HTML / XHTML
XML and XML Namespaces
A basic understanding of DTD
If you want to study these subjects first, find the tutorials on our Home page.
What is an XML Schema?
The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.
An XML Schema:
defines elements that can appear in a document
defines attributes that can appear in a document
defines which elements are child elements
defines the order of child elements
defines the number of child elements
defines whether an element is empty or can include text
defines data types for elements and attributes
defines default and fixed values for elements and attributes
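To make the list above concrete, here is a small made-up schema embedded in a Python sketch. The "note" element, its children and the "priority" attribute are invented for illustration. The code does not validate anything (Python's standard library has no XSD validator); it only reads the schema as ordinary XML, which works because a schema is itself an XML document.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: child elements, their order, an optional child,
# data types, and an attribute with a default value.
NOTE_SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="body" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="priority" type="xs:integer" default="1"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def q(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return f"{{{XSD_NS}}}{tag}"


def declared_children(schema_text, parent_name):
    """List (name, type) for each child declared in the parent's sequence."""
    root = ET.fromstring(schema_text.strip())
    parent = root.find(f"{q('element')}[@name='{parent_name}']")
    if parent is None:
        return []
    path = f"{q('complexType')}/{q('sequence')}/{q('element')}"
    return [(e.get("name"), e.get("type")) for e in parent.iterfind(path)]
```

Calling `declared_children(NOTE_SCHEMA, "note")` returns the three child declarations in the order the schema requires them.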
XML Schemas are the Successors of DTDs
We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs. Here are some reasons:
XML Schemas are extensible to future additions
XML Schemas are richer and more powerful than DTDs
XML Schemas are written in XML
XML Schemas support data types
XML Schemas support namespaces
XML Schema is a W3C Standard
XML Schema became a W3C Recommendation on 2 May 2001.
You can read more about the XML Schema standard in our W3C tutorial.

Monday, February 18, 2008

The Port Forwarding Progression

If you are wondering where to start or what exactly needs to be done, you have come to the right place. This guide will provide an overview of what you need to do and know to forward ports to your computer. It will also provide an order to the guides on these pages, and help you avoid several pitfalls.

In order to forward properly you really need to understand what port forwarding is. This might seem like an irrelevant step, but it is not. Port forwarding is akin to driving a car. It's very difficult for most people. The more you know about port forwarding, the easier it will be to get your ports from point A to point B. Imagine if you tried to go to the store without first learning how to drive your car. Well, enough of this introduction, let's get to work. The very first thing you should do is to read the following guide, which will explain what port forwarding is. Don't worry, it's really short.
What is port forwarding?
Now that you have some idea of what we are about, let's see if we can find your router. Your router should be some kind of box that your computer is connected to. The box might have a cable TV or phone line coming out of it. It probably has several flashing lights on the front of it. The cable that runs from your computer to this box is called a network cable. The ends of this cable look like the ends of a large phone cable. When you find this box, it will probably have a make and model number on it. If you can't find the make and model number, check the stickers on the bottom of the router. Write this information down. We will use it later.
Let's figure out what ports you need to forward. The first place to check is our Common Ports page. Some of the program names will be highlighted in orange. That means the name is a link, and you can click that name for further information. We have written guides for some programs. Go ahead and check the Common Ports page now for the program you want to forward ports for. If you find a guide for the program you are forwarding ports for, follow it. After you have completed that guide, come back here for further information. If you did not see the program there, you will need to find that information on the internet. Usually the software manufacturer's website is the best place for that information. Sometimes it can be very hard to find out which ports you need to forward for a program. Wherever you find the ports you need to forward, be sure to write that information down. There should be a series of ports listed, along with the protocol type of those ports. Usually this protocol type will be TCP or UDP.
Let's go to our Forwarding page. As you can see, we have a few routers listed on that page. Go ahead and find your router on that list. If you found a guide for your router on our website, go ahead and click it to open it.
We need to set up a static IP address on the computer you are going to forward ports to. A lot of people struggle with this. Really it's not that tough, so don't worry. The first thing you need to do is to read our Understanding DHCP guide. I'm sorry but I got a little long winded on this one. I still think it is under one page. Now that you know a bit more about DHCP, take a look in the guides we have written for your router. You will probably see a Setting up a Static IP Address with your router guide. Go ahead and follow that guide now. If you can not connect to your router, make sure you are entering your computer's gateway into the web browser. If you are sure that you are entering your computer's gateway into your web browser and it's still giving you a page can not be found error, your router is probably set up as a bridge. Your computer would be behind your ISP's NAT. You should contact your ISP and ask them for a public IP address. If you can not connect to the internet after following that guide, it is probably because you have the wrong DNS servers. Give your ISP a call and ask them what DNS servers to use. They should be able to tell you right off. If they can't, at least you can smile, because at this point you probably know more about networking than they do. Then go back to your TCP/IP configuration and put in the correct DNS servers.
Now that you have set up a static IP address, you are ready to forward ports. You can use port forwarding or port triggering to forward ports. Generally you should use port forwarding. Only use port triggering when the software manufacturer has provided specific port triggering settings for the program you are forwarding ports for. Never have the same port numbers defined in both the port forwarding and port triggering pages. Doing that basically screws things up, and neither configuration will work. Also do not put the same port numbers in more than one configuration. Doing that will also prevent those configurations from working properly. I'm not sure why people do that, but I've seen it often enough. Alright, go ahead and open up the port forwarding or port triggering guide for your router. Remember those guides can be found on our Forwarding page. Forward all the ports that need to be forwarded for the program you are running. This will probably require setting up multiple configurations in your router. When you are done creating configurations, remember to save those settings, and then reboot your router for the settings to take effect.
Alright! The ports should be forwarded. Now we need to make sure that there are no firewalls blocking those ports. There are a couple of places where ports can be blocked. Your ISP can block ports in their router. Hopefully this is not the case, because there is little we can do about that. If your ISP is blocking the ports required by the program you are forwarding ports for, check the program for a port configuration. Sometimes programs will allow you to set the port that they use. You could then set that program to use some port that is not being blocked by your ISP. How can you tell which port is not being blocked? You can't. You really need to just try different ports until you find one that works. Your router can have a firewall that is blocking ports from coming into your network. We have written a few guides for router firewalls on our Firewalling page. Take a look there for instructions on how to open ports in your router's firewall. You could have a personal firewall installed on your computer. You need to allow those ports through that firewall. Once again, take a look at our Firewalling page. The ports you have forwarded need to be allowed through every personal firewall you have on your computer.
Sometimes things just don't work out. I'll give a couple of suggestions here that will hopefully help you fix any problems you encounter. If everything was done properly above, the ports should be forwarded. That is assuming your ISP is not blocking those ports. In your router you can DMZ your computer's IP address. Almost every router has a DMZ. The DMZ forwards all ports to the IP address that is specified in the DMZ. DMZs are really easy to set up; you just enter the IP address to forward the ports to. To test ports that are forwarded to your computer, you would enter your computer's IP address. If the ports look like they are forwarded after you DMZ your computer, you know that the ports were not forwarded properly. Disable the DMZ. Then go take another look at the port forwarding configuration in your router. If the ports are still not forwarded after you DMZ your computer, there is probably a software firewall on your computer that is blocking those ports, or your ISP is blocking the ports. Take a really good look for a software firewall on your computer. You can also simplify the port forwarding problem by disabling firewalls. Turn off all firewalls on your computer and then disable the firewall on your router. NAT (Network Address Translation) will act as a pretty good temporary security system. NAT is already enabled if you are forwarding ports. If the ports are forwarded after turning off the firewalls, you know that one of the firewalls was causing the problem. Turn the firewalls on one at a time to figure out which one was causing the problem. Then open the ports that you forwarded in that firewall.
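If you would rather script the "is the port forwarded?" check than guess, a short Python sketch like the one below can help. The function name `port_is_open` is made up for this example. It only tests TCP ports (UDP cannot be checked this way), the program you forwarded the port for must be running and listening, and the check is most meaningful when run from a computer outside your network against your router's public IP address.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
```

A result of False does not tell you which layer is at fault, so it fits the one-at-a-time approach described above: change one thing (the DMZ, a firewall), then run the check again.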
Well I hope you found this guide helpful. Good luck!
Dave Clark