Basic Web Info
Written by Dave Barth, October 2002
Foreword
This paper is designed to provide the fundamentals for creating a simple Web page.
A short history of the Web
The Internet was conceived in the early 1960s as a solution to maintaining US defense communications channels if a war eliminated traditional communications
systems. A Rand think tank researcher, Paul Baran, wrote a paper entitled "On Distributed Communications Networks" that described a solution to maintaining
communications during an all-out war. The paper proposed a decentralized network of computers that would reroute communications if a computer in the network
were disabled.
In 1969 the Pentagon's Advanced Research Projects Agency (ARPA) funded the first packet-switching network, called ARPAnet, linking four research installations:
UCLA, Stanford Research Institute (SRI), the University of California at Santa Barbara (UCSB), and the University of Utah. By 1971 the network had grown to 15
nodes; by 1972 there were 40; and by the mid-1970s there were over 100 mainframe computers connected, including some overseas. Since then the number of connected
computers has grown to hundreds of millions, and the number of server computers that store and send information is in the millions.
The first Internet application was email, in 1972. It was followed by Usenet for online conferencing; TCP/IP (Transmission Control Protocol/Internet Protocol), which
became the standard for network communications in 1982; FTP (File Transfer Protocol) for transmitting files between computers; Telnet for remote login to a
computer; UDP (User Datagram Protocol), a method of sending data for streaming media (video clips); and DNS (Domain Name System), which resolves domain
names by mapping hostnames to numeric IP addresses.
TCP
TCP (Transmission Control Protocol) controls the sending and receiving of information "packets" between a web server and a PC. It ensures that packets
arrive and that they are in the correct order. A message or data transmission consists of one or more information packets. On the sending end, the information
is broken into parts, or packets, and each packet is sent until they have all been transmitted. On the receiving end, each packet is arranged with the other packets
to recreate the original message.
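The splitting and reassembly described above can be sketched in a few lines of Python. This is only a toy illustration, not real TCP: the function names and the packet size are made up for the example, and real TCP also handles acknowledgements and the resending of lost packets.

```python
import random

def split_into_packets(message, size):
    """Break a message into (sequence_number, chunk) pairs."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Sort packets by sequence number and join the chunks."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "Hello World, this is a short test message."
packets = split_into_packets(message, 8)   # 8 characters per packet (arbitrary)
random.shuffle(packets)                    # simulate packets arriving out of order
restored = reassemble(packets)             # the sequence numbers restore the order
```

Because each packet carries its own sequence number, the receiving end can rebuild the message no matter what order the packets arrive in.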
IP
IP (Internet Protocol) wraps each packet of data at the sending end with the numeric address of its destination so that the packet can be routed across the network.
DNS
In 1983 the University of Wisconsin developed a domain name server. DNS (Domain Name System) servers translate names that are easier to remember, for example
www.yahoo.com, into numeric Internet addresses, such as 198.137.221.9.
InterNIC
In 1993 InterNIC (Internet Network Information Center) was created to provide domain name registration. That function has since been taken over by hundreds of
commercial companies, including namesecure.com.
WWW
The advent of the World Wide Web (WWW) added graphics to the purely text-based Internet, along with the capability of linking to other Web sites. The
Web was developed between 1989 and 1991 at CERN (the European High-Energy Particle Physics Lab) near Geneva, Switzerland. It introduced three new technologies: HTML (Hypertext
Markup Language) used to create Web pages, a Web server computer using HTTP (Hypertext Transfer Protocol) to transmit the pages, and a Web browser to display
Web pages.
HTML
HTML (Hypertext Markup Language) is a derivation of SGML (Standard Generalized Markup Language), a markup language used in printing and publishing.
Browsers such as Microsoft Internet Explorer and Netscape Navigator decode HTML to display text and graphics. An example of the HTML code used to display "Hello
World" follows, but to keep the browser you are looking at from converting it, the < and > characters have been replaced with [ and ].
[html]
[head]
[title]
Display Hello World
[/title]
[/head]
[body]
Hello World
[/body]
[/html]
HTTPS
HTTPS is the secure form of HTTP (Hypertext Transfer Protocol). It sets up a secure communication channel between a PC and a web server, using a digital
certificate to establish identity on the Internet, and encrypts information that is transmitted between the PC and the server.
RTSP
Another protocol is RTSP (Real Time Streaming Protocol), used for transmitting movies and film clips over the Internet.
Browser
The first widely used graphical Web browser, Mosaic, was developed by Marc Andreessen and others at the University of Illinois at Urbana-Champaign and was provided to users at no cost. The
purpose of a browser is to translate digital information into a multi-media view that can include text, graphics, sound, and motion pictures. By mid-1993 there were 130
Web sites. Now there are hundreds of millions. Later, Marc Andreessen developed Netscape Navigator.
Client-Side software
In 1995 Microsoft introduced Internet Explorer. Netscape Navigator and Microsoft Internet Explorer are "client-side" software, meaning that they run on PCs, not
Internet servers.
Server-Side software
Server-side software runs on Internet servers to provide Web pages, FTP, and other information to PCs. The most common server-side software packages are Apache,
available at no cost from www.apache.org, and Microsoft IIS (Internet Information Server), available from www.microsoft.com. Apache runs on most server platforms,
but IIS runs only on Windows-based servers.
HTML Capitalization
HTML tags are not case sensitive. They can be in upper or lower case. Scripting and programming languages, such as Perl, JavaScript, and Java, are case sensitive.
Also, file names, including graphics extensions such as "jpg", "gif", and "png", are case sensitive on Unix servers.
Web addresses
When you set up a Web page, you may use any file name as long as it has an extension of "htm" or "html". Web servers look for a default file name,
usually "index.htm" or "index.html", when a URL (Uniform Resource Locator) names only a folder. Using the default name saves typing the file name. For example, if I had called my Web page
HTML file "dave.htm" I would have to type the following URL to get to it:
http://user.aol.com/starsall/dave/dave.htm
However, since I used the default HTML file name of "index.htm," the URL is:
http://user.aol.com/starsall/dave
By convention, "index.html" is more common on Unix servers and "index.htm" on Windows-based servers, but the name alone does not prove which kind of server is hosting the page.
URL
A URL (Uniform Resource Locator) is the most common kind of Uniform Resource Identifier (URI), and the two terms are often used interchangeably. In my sample Web page at "Basic HTML Commands", notice that the
URL of the photograph near the bottom of the page is spelled out in full. This is because it is in a different location from the HTML file. Had the photo been in the
same folder as the HTML file, it could have been referenced as simply "me.jpg".
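How a short reference such as "me.jpg" is completed can be sketched with Python's standard urllib.parse module. The page URL below is an assumption made for the example (the sample page with its default file name written out); the photo URL is the one used in the sample page.

```python
from urllib.parse import urljoin

# Assumed location of the HTML file, for illustration only.
page_url = "http://user.aol.com/starsall/dave/index.htm"

# A bare file name is looked up in the same folder as the HTML file.
same_folder = urljoin(page_url, "me.jpg")

# A URL spelled out in full is used as-is, wherever the HTML file lives.
elsewhere = urljoin(page_url, "http://user.aol.com/champplace/photos/me.jpg")
```

Here same_folder becomes http://user.aol.com/starsall/dave/me.jpg, which is why the short form only works when the photo really is stored beside the HTML file.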
URL Parts
The first part of the URL is the protocol (e.g. http, ftp, rtsp, https).
The second part of the URL is the host computer name (www).
The third part of the URL is the domain name (e.g. yahoo, basicwebinfo, amazon) followed by its TLD (top-level domain), such as com, net, or org.
The fourth part of the URL is the file requested and any folder it resides in. ("index.htm" and "index.html" are default file names and don't have to be included in the
URL.)
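As a rough sketch, Python's standard urllib.parse module can pull these parts out of a URL. The function name url_parts and the dictionary keys are made up for this example, and the code assumes a simple three-part host name such as www.yahoo.com.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Split a URL into the parts described above.

    Assumes a host name of the form host.domain.tld (e.g. www.yahoo.com).
    """
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,   # e.g. http
        "host": labels[0],          # e.g. www
        "domain": labels[-2],       # e.g. yahoo
        "tld": labels[-1],          # e.g. com
        "path": parts.path,         # folder and file requested
    }

info = url_parts("http://www.yahoo.com/index.htm")
```

For this URL the result holds "http", "www", "yahoo", "com", and "/index.htm".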
Digital images
Images that can be recognized by Web browsers must have extensions of "jpg" (Joint Photographic Experts Group), "gif" (Graphics Interchange Format),
or "png" (Portable Network Graphics). Other image formats will not work, and older browsers do not support the png format. Image type conversion programs are
available on the Web.
Image optimizer utilities can be used to reduce the size of a graphic without substantially reducing the resolution. Two of them can be found at www.tucons.com
and http://161.58.236.158/cgi-bin/public/reducer.cgi. Those for sale include Adobe Photoshop and JASC's Paint Shop Pro.
Software that can build buttons, banners, and animations includes Xara, Photoshop, and Gifworks at gifworks.com.
For digital file sizing, file type conversions, animation, etc. try Pro Viewer at brandyware.com/viewer.htm.
Clip art can be found at many locations on the web, including grsites.com/webgraphics, free-graphics.com, barrysclipart.com, free-clipart.com, freegraphics.com,
cksinfo.com/indexz.htm, http://gifart.com/public, desktoppublishing.com, and arttoday.com.
There are three types of graphic images: vector graphics, raster graphics, and FIF (Fractal Image Format).
Vector graphics use a mathematical formula to describe the image. The advantage of vector graphics is that the image can be enlarged to any size without image
degradation. However, at the time of this writing, no browsers support vector graphics, although they probably will sometime in the future.
Raster graphics are bit-mapped which means that a map of the bits is made when the image is produced. If a raster image is made larger than it was when it was
produced, the bits are expanded, but there are no bits to fill in between them, resulting in jagged edges around the image. The three graphic types compatible with
current browsers, "jpg", "gif", and "png" are raster image graphics.
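The jagged-edge effect can be demonstrated with a tiny text "bitmap" in Python. This is a simplified sketch: the function name enlarge is made up, and each character stands in for one pixel, with each pixel simply repeated to fill the larger image.

```python
def enlarge(bitmap, factor):
    """Enlarge a bitmap by repeating each pixel `factor` times in both directions."""
    enlarged = []
    for row in bitmap:
        wide_row = "".join(pixel * factor for pixel in row)
        enlarged.extend([wide_row] * factor)
    return enlarged

# A 3 x 3 image of a diagonal line ("#" is a dark pixel, "." is a light one).
diagonal = [
    "#..",
    ".#.",
    "..#",
]

# Print the image at three times its original size.
for line in enlarge(diagonal, 3):
    print(line)
```

Printed at three times the size, the smooth-looking diagonal turns into a staircase of 3 x 3 blocks, because no new detail exists to fill in between the original pixels.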
FIF (Fractal Image Format) graphics are a combination of vector and raster graphic methodology. They are a bit map described in mathematical terms.
One method of making a moving icon or moving picture is to use a tool that combines several gif images into a single image. An example of such a software tool is
Animagic, which can be downloaded from http://rtlsoft.com/anamagic.com. Of the three image formats above, gif is the only one that can be animated. An animated image is built by
making several gif images that follow a sequence and binding them together with an animation tool.
Checking Web page compatibility
Because Microsoft Internet Explorer and Netscape Navigator browsers implement the HTML standard a little differently, it is a good idea to see how your Web
page looks using each of those browsers.
FTP
FTP is "File Transfer Protocol," used to upload and download files between a server and a user's PC. Some ISPs (Internet Service Providers), such as AOL,
give their subscribers a built-in, point-and-click FTP method. To build a Web page with a graphic image, it is necessary to upload both the image and the HTML
file to a Web server. FTP client software programs can do this. Examples of free (or low-cost) FTP software are SmartFTP at smartftp.com and WS FTP LE at
tucows.com. One that sells for around $40 at the time of this writing is CuteFTP at cuteftp.com.
MODEM
Because voice phone lines operate in analog mode (a signal is carried across the wire as a series of waves) and PCs are digital (the signal consists of bits), a
modem (MOdulator DEModulator) is used to convert analog signals coming to the computer to digital and to convert digital computer information into analog for
transmission to the Web. Today's fastest modems for use with common voice telephone lines operate at 56kb/sec (56,000 bits per second, or about 7,000 bytes per second). The
earliest commercial modems ran at 300 bits per second or less and required the telephone handset to be inserted into a rubber "boot."
Bandwidth
Bandwidth is the amount of data that can be transmitted across a communication medium in a given period of time. The wider the bandwidth, the more data can be
moved. Of the connections described in this paper, the narrowest bandwidth is the analog telephone line, consisting of a twisted pair of wires, and among the widest is a T3 line,
which can be obtained, for some cost, from the local phone company.
Web Communications
Web users have many communication options for accessing the Web.
Twisted pair
The most common is the use of an analog telephone line, called twisted pair, or POTS (Plain Old Telephone Service). This method requires an inexpensive
modem costing under $50. Modern modems transfer data at 56kb/sec.
DSL
Digital Subscriber Line (DSL) uses the standard twisted pair and runs at about 150kb/sec. DSL requires a high-speed modem and a subscription from a telephone
service provider. While the line is connected to the Internet over DSL, the telephone is still available for taking or
making phone calls.
ISDN
Integrated Services Digital Network (ISDN) is a special connection that can be set up by most telephone service providers. The transmission speed is about
128kb/second and is usually billed by metered rate, which means you pay according to the amount you use it.
Copper cable
Copper cable is another option, but the number of subscribers on the local loop affects transmission speed. The more PCs that are connected to the cable, the
slower the overall performance. Special equipment is required. Copper cable is sometimes called "dark pipe."
Satellite Internet Service
Satellite Internet service runs at the speed of the PC modem for uplinks and is very fast for downlinks. Special equipment is required. An example is StarBand, which
requires the receiving dish to be placed by a professional for good data reception.
Fiber-optic Cable
Fiber-optic cable is extremely fast, is not impaired by electrical problems such as lightning strikes, and is sometimes called a "light pipe," as opposed to copper
cable which is called a "dark pipe." Fiber-optic cable can carry more data than copper cable.
Wireless
Wireless transmission is common in handheld Internet devices and in PCs or laptops that are installed in a home or office some distance from the receiving equipment.
These devices often double as telephones and use cellular phone sites or wireless transmitters for the transmission of Internet data. Wireless is also being used in
some college classrooms where each student has a laptop with a receiver. Nearby is a transceiver (transmitter/receiver) which provides each student with a link to the
information pertaining to the class such as notes, slides, films, remote instruction, the Internet, etc.
T1 and T3 lines
T1 and T3 lines are special communication lines set up by the telephone service provider. They provide very broad bandwidths to allow fast communications. As
might be expected, they are also very expensive. T1 lines have transmission speeds of 1.5 Mb/second and T3 lines run at 45 Mb/second. Fractional T1 line speeds are
around 256 kb/second.
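To get a feel for what these speeds mean, the ideal transfer time for a file can be worked out with simple arithmetic. The sketch below uses the speeds quoted in this paper and assumes they are in bits per second, which is how line speeds are normally rated. The 100,000-byte photo is a made-up example, and real transfers are somewhat slower because of protocol overhead.

```python
# Line speeds in bits per second, taken from the figures quoted in this paper.
SPEEDS_BPS = {
    "56k modem": 56_000,
    "ISDN": 128_000,
    "DSL": 150_000,
    "T1": 1_500_000,
    "T3": 45_000_000,
}

def transfer_seconds(size_bytes, bits_per_second):
    """Ideal transfer time: 8 bits per byte, no protocol overhead."""
    return size_bytes * 8 / bits_per_second

photo_bytes = 100_000   # hypothetical 100,000-byte image
for name, speed in SPEEDS_BPS.items():
    print(f"{name}: {transfer_seconds(photo_bytes, speed):.2f} seconds")
```

The same calculation is a quick way to decide whether a graphic is small enough for visitors who are still using a modem.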
Stories
* In 1997, when I was trying to set up a Web page that could be read by people in different countries, I enlisted the help of several multilingual coworkers to translate
my text into their "first" language. I soon found that there was no way to represent the characters of Japanese, Chinese (Mandarin), or the accent characters of
Vietnamese using Microsoft Word. I finally photographed the text for those languages and displayed it on the Web as a graphic. Today, double-byte character codes are
available for displaying pictographic languages and languages, like Vietnamese and Tagalog, that use special accent marks.
* In 1997 a coworker asked me to put photos of his children on the Web. I asked him why he wanted them on the Web. He said that it was because he and his
wife had relatives all over the U.S., and it was too expensive and time-consuming to mail extra prints to all of them. Also, since the kids grew fast, I was able to
update the "Web photo album" every six months. All he had to do was provide me with the photos and email the relatives of the location of the latest photos. From
a security perspective, I put only the first name of each child in the photos and did not indicate where they lived.
* Non-profit organizations or small companies are candidates to practice Web page development on because they usually don't have sufficient funding to pay for a
Web presence. I used to set up a lot of these, especially for animal shelters such as The Max Fund and The Cat Care Society.
* In 1997 most owners of small businesses didn't know about a Web presence and really didn't care. I experimented with Web page development by going by their
place of business, taking a photograph (which I later scanned into a digital gif or jpg image), setting up a Web page for them, printing it out, and giving it to them. Nearly
all of them had no idea what a Web page was, and the value of my efforts was simply the practice I got setting up Web pages.
* In 1999 a friend asked me to put photos of his 1971 Corvette on a Web page so he could sell it. It wasn't long before it sold for $18,000. I should have asked for a
commission!
Domain Names
A domain name is a name that is reserved for the purchaser as long as the purchaser continues making the annual renewal payment. InterNIC used to assign
names, and they were free. Today hundreds of companies provide that service, and the cost is around $10 for two years and $7 for subsequent years. Prices
have decreased during the past couple of years, and that trend may continue due to competition between the various companies providing domain
name services. These companies have a search feature that indicates whether a name has already been taken. If the name has not been
reserved, a person can purchase it.
Domain names have several possible extensions. Most domain name providers allow you to reserve names with "com," "net," "org," "biz," and "info"
extensions. "Com" is used for commercial companies. "Org" is used by non-profit organizations. "Net" is for networks. Three-letter names with a "com" extension
are already reserved.
Most, if not all, words from the English dictionary are reserved. Many two-word combinations are also reserved. Like trying to find a vanity
license plate name, finding a domain name that has not been registered may require some searching. Other domain name extensions include "edu" for educational
institutions, "gov" for non-military government agencies, and "mil" for military organizations. Because "com" names have been pretty well used up, domain names
with a "cc" extension can be obtained from get.cc. Names with a "ws" extension can be obtained from worldsite.ws.
Domain name providers allow you to point the name to any other URL. That is the method used for pointing starattraction.com and nikonuniverse.com to my AOL account.
A nice adjunct is that email directed to a domain name can be sent to your email site. For example, any character(s) can precede the "@" sign, followed by the domain name, and the email
will get to your email service.
How to Copy Graphics and other files from Web Pages
You can right-click a graphic image on another Web page and save it to your PC. However, beware of violating copyrights.
Copyrights
Any material copyrighted in a Web page is the sole property of the copyright owner. Many Web sites allow downloading of their contents and there are sites that
provide shareware and public domain graphic images and software. However, it is a good idea to consider the ownership of anything downloaded from another Web
site. Of course, you may copyright your own Web site, and that copyright will apply to items that can be copyrighted and that do not belong to someone else. Also
keep in mind that your Web site creations may belong to your employer. If there is an item on a copyrighted Web site that you would like to download, you may be
able to request and receive permission from the copyright owner.
Visitor Counters
Sometimes called "hit counters," these are handy for finding out how many times your site has been accessed. Some ISPs (Internet Service Providers), including
AOL, provide a counter for use by subscribers. A counter is usually a CGI (Common Gateway Interface) program written in a language such as Perl, or a Java program, and the count is actually stored on an ISP's server. A good site to
download a visitor counter is www.xoom.com.
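The idea behind a counter is simple enough to sketch in Python. This is not how any particular ISP's counter works: the function name and the file name hits.txt are made up, and the count is kept in a temporary folder here, where a real counter would keep it on the server. A real counter would also need to cope with two visitors arriving at the same moment.

```python
import os
import tempfile

def record_visit(counter_path):
    """Read the stored count, add one, write it back, and return the new count."""
    try:
        with open(counter_path) as f:
            count = int(f.read().strip() or 0)
    except FileNotFoundError:
        count = 0                      # first visit: no count stored yet
    count += 1
    with open(counter_path, "w") as f:
        f.write(str(count))
    return count

# Hypothetical counter file; a real counter would live on the server.
counter_file = os.path.join(tempfile.mkdtemp(), "hits.txt")
first = record_visit(counter_file)
second = record_visit(counter_file)
```

Each time the page is requested, the server would call something like record_visit and place the returned number in the page.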
Firewalls
Web servers are usually protected by a firewall, software or hardware that filters network traffic and attempts to prevent threats from penetrating the servers. Access to
protected areas usually requires a password or other identification.
Antivirus software
Antivirus software isn't really a Firewall, but it checks incoming files to try to determine if they are infected by malicious software. World governments are beginning
to crack down on virus perpetrators. ISPs are now providing law enforcement agencies ways to track down the source of viruses.
Hot Spots (Imagemaps)
Imagemaps (Hot Spots) are invisible areas on an image in a Web page, each of which is linked to a URL. The best way to create them is to use Web page editing software such as
Microsoft FrontPage, Microsoft Visual InterDev, Cold Fusion, Hot Metal, etc.
Frames
Frames were once widely used to provide a way to navigate directly to the various pages within a Web site using a scrollable column. Generally, the frame on the left side
of the page had links to content that would be displayed in the right frame. Frames have been deprecated in the latest versions of HTML and XHTML, and someday
browsers may no longer display them correctly. The newer approach is to lay out both columns in a single page so that they scroll together.
HTML Quick Reference
The best way to find reference material is to search the Web using one of the many search engines.
Web search engines (short list)
hotbot.com
webcrawler.com
yahoo.com
excite.com
infoseek.com
iwon.com
ask.com
lycos.com
miningco.com
datamine.com
google.com
Web Applications
One way to create an application on the Web is to subscribe to a Web hosting site which will check out your programs before they run to make sure the site won't
be harmed by your software. Another option is to have an "intranet" site that consists of one or more servers owned by a company or individual (you, perhaps). The
Intranet server(s) may be connected to the Web.
There are thousands of Web applications, with more being developed each year. Some examples are:
Auctions (some for individual sellers - e.g. ebay.com; and others for auctioning an oversupply of products and services - e.g. priceline.com)
Bill payment (you are emailed when a bill is due, and you either authorize the billing service to pay the bills from your credit card account, or you manually pay them
to the billing service) (e.g. www.paypal.com)
Music (listening to music from a Web page; downloading music; and selling music, e.g. www.amazon.com)
Books (reading books online; purchasing books) (e.g. www.amazon.com)
News (many sites can select the desired type of news to fit the desires of each individual subscriber)
Equity market information and transactions (e.g. www.schwab.com)
Banking services (checking balances and transferring funds) (e.g. www.comfedbank.com)
Finding products and services to purchase (e.g. www.cnet.com)
Maps (e.g. www.mapquest.com)
Encyclopaedias
Search engines
"Shopping cart" applications
Web-based applications to replace legacy (traditional, non-internet) applications
A Survey of Simple Web Pages
www.barthstories.com
www.starattraction.com
www.nikonuniverse.com
www.basicwebinfo.com
Basic HTML commands
Basic HTML: Replace square brackets "[]" with angle brackets "<>".
[html]
[head]
[title]
This is the title.
[/title]
[/head]
[body] [center]
This is some text.
[/center]
[strong]
This is bold text.
[/strong]
[!-- This is a comment --]
[a href="http://www.acbpm.com"]This is a link.[/a]
[img src="http://user.aol.com/champplace/photos/me.jpg"]
[br]
[hr]
[p]
[/body]
[/html]
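If you would rather not retype the brackets by hand, the replacement can be done with a couple of lines of Python. This is only a convenience sketch with a made-up function name; it assumes the text contains no square brackets other than the ones standing in for tags.

```python
def to_html(bracketed):
    """Turn this paper's [tag] notation into real <tag> markup."""
    return bracketed.replace("[", "<").replace("]", ">")

sample = "[strong]This is bold text.[/strong]"
html = to_html(sample)
```

The result for the sample line is <strong>This is bold text.</strong>, ready to be pasted into an HTML file.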
Meta Tags
Meta tags give search engines information about what your page is all about. When a search engine finds a Web page, the meta tags
help it catalog the page correctly, so that the page's address is displayed when someone searches for a subject that matches the meta tags.
See some of the Web page examples in addition to the example below. Meta tags reside in the heading, so only the heading portion of a
Web page is shown in this example. Replace square brackets "[]" with angle brackets "<>".
[!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"]
[html]
[head]
[title]US Lighthouse Society[/title]
[META NAME="Author" CONTENT="David V. Barth"]
[META HTTP-EQUIV="Content-language" CONTENT="en-US"]
[LINK REV=made href="mailto:dvbarth@aol.com"]
[META NAME="rating" CONTENT="General"]
[META NAME="ROBOTS" CONTENT="INDEX, FOLLOW"]
[META NAME="Description" CONTENT="Copyright (C) US Lighthouse Society"]
[META NAME="KeyWords" CONTENT="light house,lighthouse,lantern room,us lighthouse society,lighthousesociety"]
[/head]
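To see roughly what a search engine's robot gets out of these tags, the heading can be read with the HTML parser in Python's standard library. This is a simplified sketch, not how any real search engine works. The class name MetaCollector is made up, and the heading is a shortened copy of the example above with the square brackets already replaced.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect NAME/CONTENT pairs from meta tags, as an indexing robot might."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)        # attribute names arrive lowercased
        name = attributes.get("name")
        if name:
            self.meta[name.lower()] = attributes.get("content", "")

head = """
<head>
<title>US Lighthouse Society</title>
<META NAME="Author" CONTENT="David V. Barth">
<META NAME="KeyWords" CONTENT="light house,lighthouse,lantern room">
</head>
"""

collector = MetaCollector()
collector.feed(head)
keywords = collector.meta["keywords"].split(",")
```

After this runs, keywords holds the three search phrases and collector.meta["author"] holds the author's name. Tag and attribute names are matched without regard to case, in keeping with HTML itself.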
A Logical Learning Progression for Java
A logical way to learn Java is to start with C, a procedural language where instructions are processed in a definite sequence. That provides the basic
syntax for C++ and Java. Then learn C++, which is similar to C but is an object-oriented language. Java is the final step, and it is similar to C++.
eCommerce Application Life Cycle
An application life cycle is the set of phases that an application goes through during its life. The life of an application may be short, lasting less than a year, or it may
be in use for many years.
An eCommerce application life cycle might look like this:
Determine the goal of the application;
Draw up a plan for implementing it;
Decide on the hardware required to run it;
Obtain the communications necessary to provide content to the Web;
Build a modem pool;
Obtain and set up servers (PC, Unix, or Mainframe);
Define a file backup philosophy;
Set up security for transactions;
Set up security for the servers such as firewalls;
Decide on what levels of fault-tolerance are to be used such as disk mirroring;
Do the application development;
Test the application and make corrections, as necessary;
Place the application into production;
Maintain the application;
Provide customer support;
Enhance the application, as needed
ASP (Active Server Pages)
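Active Server Pages (ASP) is a Microsoft technology for building interactive Web pages: scripts run on the Web server and generate the page each time it is requested.
Server-side technologies such as ASP are where the Web derives much of its power and functionality. For additional information,
see devasp.com.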
Glossary
ActiveX - A Microsoft technology for embedding components into Web pages.
applet - A Java program designed to be embedded into a Web page.
argument - A parameter that is passed to a called function.
array - A like set of variables that can be referenced by an index.
Boolean - A variable that can store only "true" and "false."
interpreter - The browser component that translates a script from readable source text into instructions the computer can execute.
Java - An object-oriented language for Web pages, developed by Sun Microsystems.
JavaScript - A scripting language for Web pages, developed by Netscape. Despite the name, it is a separate language from Java. It is not as capable as Java, but is easier to implement.
method - A special function that can be stored in an object and acts on the object's properties.
object - A variable that can store multiple values, called properties, and functions, called methods.
scope - The part of a JavaScript program that a variable was declared in and is available to.
tag - A word or group of words used in HTML, surrounded by less-than and greater-than signs, for controlling the way a Web page is displayed. For example,
to bold the word "car," the tags could be written (with square brackets standing in for the less-than and greater-than signs) as: [strong]car[/strong].
VBScript - A scripting language developed by Microsoft, based on Visual Basic. VBScript is only supported by Microsoft Internet Explorer, not Netscape
Navigator.
Glossary of terms (buzzwords) used in eCommerce presentations.
Change Agent - Something that causes change.
Transitioned Roles - Jobs that changed.
Achieve Project Definition - To Define a Project.
Operational Consolidation - To combine functions.
Drive Change - To cause an evolution in methods.
End-Game - The desired result of an action.
Holistic Approach - Overall plan that takes a wide view.
Macro Basis - An underlying concept that considers most issues.
Vision of Future - An idea or ideas that are desirable.
Strategically Aligned Governance - A multilateral agreement between leaders to achieve long-term goals.
Strategic Alignment - A long-term multilateral agreement.
Run Metrics - Measurements for actions to achieve a desired result.
Deliver Value - To do work that achieves desired goals for the client.
Net Centric Skills - Skills necessary to achieve the desired goals.
"Skin in the Game" - An investment by a party that encourages it to push for success.
Peer-Client Relationship - An engagement where the contractor and the client have an investment in the venture.
Operational Consolidation - Multilateral merging of expertise to achieve a goal.
Strategy-Led Perspective - A viewpoint based on a long-term goal.
Data Points - The results of tracking achievement over time.
WAP (Wireless Application Protocol) - A methodology for connecting to the Internet without using a hard wire.
Regulator Paradigm - The attitudes, issues, and concerns of a government body that has authority over the parties.
Universal Access - Entry capability for the majority of persons or parties.
Consumer Aggregators - Entities that provide a wide range of services to end-users.
Enterprise Aggregators - Entities that provide a wide range of services to companies or agencies.
Heritage Voice - Traditional analog talk over twisted pair or plain old telephone system (POTS).
Heritage Data - Traditional analog information transmission over twisted pair or plain old telephone system (POTS).
Revenue-Sharing Models - Methodologies that describe how two or more cooperating parties divide income between each other.
Decoupling of Information - Separating a methodology from the data it transmits.
Bill Presentment - Sending an invoice to a party.
Dynamic Pricing - Cost of services or products that is easily changed.
Architecture Direction - Design trend.
Event-Driven Architecture - An object-oriented approach to application development.
Net-Centric Architecture - Architecture used to achieve a desired goal.
Collaborative Leadership - Management by two or more parties.