
The Internet and the World Wide Web

Charles A. Cornell, Multi-Media Communications Internet Services

What is the Internet

Before you can understand what the World Wide Web is, you have to understand what the Internet is, since the World Wide Web is a subset of the Internet.

The Internet is a world-wide network of networks with gateways linking organizations around the world. The organizations are administratively independent from one another. There is no central, worldwide, technical control point. Yet, working together these organizations have created what to a user seems to be a single virtual network that spans the globe.

On one level, the Internet appears to be a single switched network that allows any computer attached to it to establish communications with any other computer attached to it. The analogy that I like best is to compare the Internet with the worldwide telephone system. The telephone system consists of many different networks that have worked out how to cooperate with each other. This allows you to place a call through your local telephone company and have it handed off to a long-distance company, then perhaps to a foreign telephone agency, before reaching the party you called. You don't have to worry about any of it because all the telephone networks participating in the worldwide telephone system adhere to common standards.

The physical circuits that make this possible are provided by government, educational, and commercial organizations. More and more, Internet data is carried over the same networks that carry telephone data. Each of the independent networks is managed by its own administration. There is no one organization called "The Internet". Contrary to popular understanding, the Internet is not free. The networks all use a common suite of networking protocols, TCP/IP, which stands for Transmission Control Protocol/Internet Protocol. Because of these common protocols, common network functionality, and interoperability, the networks provide what appears to be a seamless, integrated virtual network, regardless of the heterogeneity of the underlying computer hardware or communications transport.

How Do I Connect to the Internet

Up until a few years ago, only organizations such as universities and large companies had direct connections to the Internet. Now it is increasingly common for individuals to connect their personal computers to the Internet by signing up with a commercial Internet access provider. These commercial access vendors provide dial-in modems that a user reaches over a standard telephone connection. The access provider then assigns the user's system an IP address from a block assigned to the provider, thus connecting the system to the Internet. Most commercial, government, and educational organizations have dedicated line connections to their Internet access provider. Connection speeds range from 28.8K or 56K BPS over dial-up modems to 1.5M BPS and higher over dedicated lines.

TCP/IP protocol

All of the computers attached to the Internet use the TCP/IP protocol to send messages between various application programs. You really don't need to know what TCP/IP is to use it, any more than you need to know what the telephone switching protocols are to make a telephone call. The TCP part of TCP/IP defines how packets of information are organized so that they can be safely transported over the physical networks. If you want to ship something by truck and the thing is bigger than a truck, you have to disassemble it, and then provide instructions so that whoever you are sending it to can put it back together correctly. That is what TCP does to the data going over the Internet. It defines how it is to be broken up into standard sized data packets, and how those packets are to be put back together on the receiving end.
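The truck analogy can be sketched in a few lines of Python. This is only a toy illustration of breaking data into numbered pieces and putting them back together. Real TCP also handles acknowledgements, retransmission, and flow control, none of which is shown here, and the function names and the 8-byte packet size are assumptions made for the example.

```python
import random

def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Break data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the chunks back in sequence order, whatever order they arrived in."""
    return b"".join(chunk for _, chunk in sorted(packets))

message = b"If the thing is bigger than a truck, disassemble it."
packets = split_into_packets(message, size=8)
random.shuffle(packets)  # simulate packets arriving out of order
assert reassemble(packets) == message
```

The sequence number plays the role of the reassembly instructions: the receiver does not need the packets to arrive in order, only to know where each one belongs.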

To be attached to the Internet, a system must have an IP address. To go back to our telephone analogy, IP addresses are like the main number for a company or organization. IP addresses are 32-bit numbers that are written as four numbers between 0 and 255 separated by periods. For example, 127.0.0.1 and 199.173.190.2 are IP addresses. To be connected to the Internet, an organization must apply for a registered IP address from a central agency called the Internet Assigned Numbers Authority (IANA). This address is the most significant part of the IP address of all machines in the organization's network that are accessible from the Internet. The organization picks a human-understandable name, called the domain name, which is registered with IANA so it can be associated with the IP address. The domain name ends with a suffix that indicates the type of organization or, for international organizations, its geographic location. For example, mmcis.com is a commercial domain and whitehouse.gov is a government domain.
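To make the "32-bit number written as four numbers" idea concrete, here is a minimal sketch that converts between the dotted form and the underlying integer. The function names are made up for this illustration. Python's standard ipaddress module offers the same conversion.

```python
def dotted_to_int(address: str) -> int:
    """Convert a dotted address such as '127.0.0.1' to its 32-bit integer value."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid dotted address: {address!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each number fills the next 8 bits
    return value

def int_to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(dotted_to_int("127.0.0.1"))
print(int_to_dotted(dotted_to_int("199.173.190.2")))
```

Each of the four numbers occupies 8 of the 32 bits, which is why none of them can exceed 255.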

Typically, the domain name is associated with the first 2 or 3 numbers in the IP address, which allows the organization to assign individual system addresses in its domain with the last 2 or 1 numbers. So if a company is assigned the domain name abc.com, which has associated with it a Class C IP address of 192.0.2, then it can assign 254 systems their own IP addresses using the last number (0 and 255 are reserved for the network itself and for broadcast). So valid IP addresses in that domain would be 192.0.2.1, 192.0.2.2, through 192.0.2.254.
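The host numbering within a Class C block can be checked with Python's standard ipaddress module. The sketch below assumes the example network 192.0.2.0, a block reserved for documentation, because each number in a real address must be 255 or less. Note that the module sets aside the .0 (network) and .255 (broadcast) addresses, which leaves 254 addresses for individual systems.

```python
import ipaddress

# A Class C block: the first three numbers are fixed, the last one varies.
network = ipaddress.ip_network("192.0.2.0/24")
hosts = list(network.hosts())

print(len(hosts))                 # number of assignable system addresses
print(hosts[0], hosts[-1])        # first and last assignable address
print(network.network_address)    # reserved: names the network itself
print(network.broadcast_address)  # reserved: reaches every system on the network
```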

Domain Name Service

Just as the telephone system provides directory services so that you can find out the telephone number of a person or company, the Internet provides directory services that return the IP address for machines in registered domains. Each Internet access provider is responsible for maintaining Domain Name Service information for all the addresses assigned to it.
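From a program's point of view, a directory lookup is a single call. The sketch below wraps Python's socket.gethostbyname in a hypothetical helper named lookup. The resolver parameter is an assumption added so the example can be tried without a network connection.

```python
import socket

def lookup(name: str, resolver=socket.gethostbyname):
    """Return the IP address registered for `name`, or None if none is found."""
    try:
        return resolver(name)
    except OSError:  # socket.gaierror, raised on a failed lookup, is an OSError
        return None
```

On a machine connected to the Internet, lookup("www.mmcis.com") would ask the configured name servers for that system's address. The answer depends on what is registered at the time, so no result is shown here.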

What Can I Do On the Internet

The last piece of the picture we need to talk about before we can actually use the Internet is the idea of clients and servers. The TCP/IP protocols make it possible to electronically connect any two systems on the Internet, but for them to do something useful with that connection, they must have cooperating applications running on them. On the telephone system I can have successful transactions between two people using telephones or two people using a fax machine. But a person on a telephone talking to a fax machine is usually not successful. So it is on the Internet. A successful transaction on the Internet depends on the system requesting information (the client) using a compatible program with the system delivering the information (the server).
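A minimal sketch of a cooperating client and server follows, with both running on the same machine over the loopback address 127.0.0.1. The function names and the reply format are invented for this illustration. A real server would handle many clients and a defined application protocol.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Server side: accept one client, read its request, send back a reply."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"server received: " + request)

def ask(port: int, request: bytes) -> bytes:
    """Client side: connect, send a request, and return the server's reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(request)
        return client.recv(1024)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the operating system pick a free port
listener.listen(1)
port = listener.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()
reply = ask(port, b"hello")
worker.join()
listener.close()
print(reply)
```

The exchange works only because both sides agree on who speaks first and what a reply looks like, which is the telephone-versus-fax point made above.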

Session Protocols

Between the low level packet and transport protocols (TCP/IP) and the high level application protocols discussed below there are a few standard session protocols that are supported by clients and servers that are shipped with most implementations of TCP/IP software. These include:

Internet Applications

While there are many ways to use the Internet today, and many more to be discovered or invented in the future, the following are the primary ways that people use the Internet today.

mail

You can use various editors and mail front-end programs to send an email message to anyone who has an account on any system that is registered on the Internet. There are people on the Internet who maintain mailing lists of users who are interested in a particular subject, so that items of interest to that group can be mailed to everyone by using the list alias as the addressee on an email message.

news

News is a method for collecting and disseminating messages (called posts) that are grouped into a hierarchy of subject interests. Each special interest news group must have a system that acts as the collector of news items for that group. People post items of interest to the news group by using a news reader to send their message (via the Internet address of the collector) to the collector system. The collector system periodically broadcasts the latest items for that news group to systems that are designated as news servers. Most Internet access providers provide news servers.

Internet subscribers specify what news groups they want to receive. All the items in those news groups are transferred as files in a hierarchy of directories that maps the special interest groups. Individual users can then use one of the publicly available news reader programs to select which of the locally available groups they want to read. Most readers also allow a user to reply to a post, thus creating a news thread which is a series of posts, replies to posts, replies to replies, etc., on a particular subject in the special interest group. News groups are named by a series of hierarchical nodes separated by periods (.). A group that contains discussion of Silicon Graphics hardware related stuff is comp.sys.sgi.hardware, which is pronounced comp dot sys dot sgi dot hardware. News groups range from fairly serious and technical ones like comp.os.ms-windows.programmer.win32 to the humorous like rec.humor.funny to the frivolous like alt.silly-group.radish-therapy.
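The mapping from a group name to a directory hierarchy can be sketched as follows. The spool location /var/spool/news and the function name are assumptions for illustration only. Actual news server layouts vary.

```python
from pathlib import PurePosixPath

def group_to_path(group: str, spool: str = "/var/spool/news") -> PurePosixPath:
    """Map a news group name to a directory: one directory level per node."""
    return PurePosixPath(spool, *group.split("."))

print(group_to_path("comp.sys.sgi.hardware"))
print(group_to_path("rec.humor.funny"))
```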

electronic distribution

There are many systems on the Internet that maintain file servers called ftp servers. These systems allow a user to use the ftp command to log in to the system as a user named anonymous. The user can then download files from the server (or upload them, as permitted by the server system). These files may be textual material, computer programs, or multimedia documents. An indexing protocol that uses a client called gopher is often used to index files that are on ftp servers. World Wide Web servers are rapidly replacing the rudimentary ftp servers as the way to access and index these files. There are literally hundreds of thousands of files that can be accessed this way. Some of the available servers are:

electronic publishing

In addition to many academic journals which are made available on the Internet via gopher, ftp, or WWW servers, there is an ever growing number of commercial publishing efforts getting started. All of these new efforts use World Wide Web technology.

Finally, What is the Web?

The World Wide Web (from here on called simply the Web) is the part of the Internet that uses a session protocol called HTTP (HyperText Transfer Protocol). The Web merges the techniques of networked information and hypertext to make an easy but powerful global information system. The project represents any information accessible over the network as part of a seamless hypertext information space.

The Web was originally developed at CERN (the European Laboratory for Particle Physics in Geneva) to allow information sharing within internationally dispersed teams, and the dissemination of information by support groups. Originally aimed at the High Energy Physics community, it has become the "killer app" of the Internet, spurring explosive growth. It is currently the most advanced information system deployed on the Internet, and embraces within its data model most information in previous networked information systems. In fact, the Web is an architecture designed to embrace future advances in technology as well, including new networks, protocols, object types and data formats. Client (information reader or browser) and server (information supplier) applications exist for many platforms and are under continual development.

The Information Reader's View

The Web consists of documents and indexes. Indexes are special documents which, rather than being read, may be searched. The result of such a search is another ("virtual") document containing links to the documents found. A simple protocol (HyperText Transfer Protocol or HTTP) is used to allow a browser program to request a keyword search by a remote information server. There are many vendors providing Web browsers. The most popular are Netscape Navigator by Netscape Communications and Internet Explorer by Microsoft.

The Web contains documents in many formats. The most effective are hypermedia documents, which contain links to other documents or to places within documents. All documents, whether real, virtual or indexes, look similar to the reader and are contained within the same addressing scheme. Documents can contain graphic images, sounds, and full motion video segments, as well as text. To follow a link, a reader clicks with a mouse. To search an index, a reader gives keywords (or other search criteria). These are the only operations necessary to access the entire world of data. In addition to displaying hypermedia data, Web clients support on-line forms which can transmit data from the client to a server, making the Web an interactive medium.

The Information Supplier's View

To provide information to the Web, you must create the hypermedia documents and any supporting indexes, and then install Web server software on a machine connected to the Internet. More sophisticated Web servers provide database support for both serving information and collecting information from forms filled in by clients. The Web model overcomes the frustrating incompatibilities of data format between suppliers and readers by allowing negotiation of format between a smart browser and a smart server. This provides a basis for extension into multimedia, and allows those who share application standards to make full use of them across the Web.

Hypermedia documents on the Web are formatted using a special markup language called HTML, for HyperText Markup Language. An HTML document is a simple ASCII file that contains the plain text information along with special markup "tags" that tell the browser how to format the text and where to put graphics, audio, video, or other media information. There are hundreds of tools for producing HTML documents from existing documents in word processing formats, as well as for authoring new documents directly. The most important markup tag in terms of the power of hypermedia documents is the link, which associates another document with part of the text or graphics in a document. An HTML link in a document can point to a document anywhere on the Web.
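To show what a link tag looks like in practice, the sketch below feeds a small, made-up HTML page to Python's standard html.parser module and collects each link's target and text. The sample page and the LinkCollector class are illustrative assumptions, not part of any particular browser or server.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (target, link text) pairs from the <a href="..."> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # target of the link currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

page = """<html><body>
<h1>Sample page</h1>
<p>Visit the <a href="http://www.mmcis.com/HomePage.html">MMCIS home page</a>
or the <a href="http://www.whitehouse.gov">White House</a>.</p>
</body></html>"""

collector = LinkCollector()
collector.feed(page)
for target, text in collector.links:
    print(text, "->", target)
```

The text between the opening and closing tags is what the reader sees and clicks on. The href value is the document the browser then fetches.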

As with Web browsers, there are many vendors offering Web server software. The most popular are the freeware Apache server, which is used by many ISPs, Internet Information Server (IIS) from Microsoft, Netscape's Enterprise Server, and O'Reilly's WebSite for NT.

How Do I Point To Information On the Web?

Information on the Web is addressed using a Uniform Resource Locator, or URL. A URL contains a protocol specification, the name of the system containing the information, and the pathname of the file on that server system. The URL http://www.mmcis.com/HomePage.html says to use the HTTP protocol to access a file named HomePage.html on the system named www.mmcis.com. The URL ftp://ftp.microsoft.com/deskapps/word/winword-public/ia/wordia.exe says to use the FTP protocol to transfer the file wordia.exe from the directory path /deskapps/word/winword-public/ia/ on the system ftp.microsoft.com.
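The three parts of a URL can be pulled apart with Python's standard urllib.parse module. The describe helper below is a hypothetical name used only for this sketch. It splits the two example URLs from the paragraph above into protocol, system, directory, and file.

```python
import posixpath
from urllib.parse import urlsplit

def describe(url: str) -> dict:
    """Split a URL into the protocol, system name, directory path, and file name."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "system": parts.hostname,
        "directory": directory,
        "file": filename,
    }

print(describe("http://www.mmcis.com/HomePage.html"))
print(describe("ftp://ftp.microsoft.com/deskapps/word/winword-public/ia/wordia.exe"))
```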

For the most part, readers never even see a URL. They simply click on specially marked text or graphics in an HTML document and the browser generates the URL that is linked to that specially marked text or graphic. The convention for servers on the Web is to name the main Web server system www.domain.name, and to provide a default menu so that a simple URL like

http://www.mmcis.com
or
http://www.whitehouse.gov

will give a reader access to the top of the document tree stored on that organization's Web servers. Most Web browsers also ship with built-in links to several different indexing servers on the Internet so that even the most inexperienced net surfer can plunge into the Web without a steep learning curve.

What Next?

If you have read this far, you obviously find this Web thing fascinating. The next thing you need to do is to get yourself an Internet connection and start exploring the Web. To find out how World Wide Web technology can be a strategic tool for any organization that has to publish information to a large number of people, or that wants to provide an interactive medium with a dispersed group of people, contact Multi-Media Communications. Email us at cac@mmcis.com, leave us an action request on our Web site, or even call us at (508) 653-3392 and ask for Charlie Cornell, and we will be glad to show you how to use the Web to your advantage.

Charles Cornell (cac@mmcis.com), Multi-Media Communications Internet Services
