Abstract: The World-Wide Web is the most talked-about distributed information system today. This paper does not touch on its workings; it tries to give a brief history and outlines the feelings provoked by the explosive adoption in all circles of WWW as the first vehicle on the Global Information Infrastructure.

Keywords: WWW, World-Wide Web, History, SGML, Cultural Aspects, Society.

Category: H.5.1

1 Introduction

Before the World-Wide Web, networked information was difficult to access. With the Web, browsing through distant data bases has become almost a recreational pleasure.

It is fair to say that the Web is now driving the Internet. In fact many recent articles in newspapers and magazines simply make no distinction between the Internet and the World-Wide Web: it is as if the roads had been lying there for some time, waiting for someone to invent the Volkswagen.

I have followed the development of the Web from the days before it had a name. A brief history is therefore in order.

2 Brief History

In 1989, Tim Berners-Lee and I independently proposed a project for studying hypertexts and their possible uses at CERN. We quickly joined efforts; Tim already had a prototype and a set of ideas for using the hypertext paradigm over the network. In this, he was no doubt influenced by the earlier work of Ted Nelson (Xanadu, [Nelson 88]).

In 1990, Tim implemented the first browser/editor under the NeXTStep operating system. This was easily possible, since the NeXTStep system came with an object-oriented development kit which included not only a graphical interface builder (itself wysiwyg!) but also a programmable text editing object with paragraph styles. We were off the ground, at least as far as we ourselves were concerned: the browser/editor as a single tool for both navigating and correcting, editing, composing texts and hypertexts was a real dream object. To this date, the ease of use of that program has not been surpassed in the WWW world.

In 1991, Nicola Pellow, a technical student at CERN, wrote the Line Mode Browser. This was a simple, character-grid oriented client which was written in C (not even ANSI, but just flat C!). It could be compiled on just about anything but the kitchen sink. Through its availability, the Web began to spread outside CERN.

In 1992, the first steps were taken to implement format 'negotiation', i.e. a mechanism whereby a server and a client could agree on the format of the document to be transmitted. This was also the year of the integration of all the other useful and existing protocols on the Internet: Gopher, ftp, telnet etc.
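
As an illustration of such a negotiation, here is a minimal sketch in Python. It assumes that the client sends a comma-separated list of formats it accepts, each with an optional preference weight 'q', in the style of the Accept header later standardised for HTTP; the function name 'negotiate' and the sample formats are hypothetical, and the 1992 mechanism may have differed in detail.

```python
def negotiate(accept_header, available):
    """Return the available format the client prefers most, or None."""
    preferences = []
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        quality = 1.0  # assumed default weight when the client gives none
        for param in parts[1:]:
            if param.startswith("q="):
                quality = float(param[2:])
        preferences.append((quality, media_type))
    # Highest weight first; sorted() is stable, so ties keep the client's order.
    for quality, media_type in sorted(preferences, key=lambda p: -p[0]):
        if quality > 0 and media_type in available:
            return media_type
    return None


# The client would like PostScript best, but the server only holds
# HTML and plain text, so the two sides settle on HTML.
chosen = negotiate(
    "text/html;q=0.8, application/postscript, text/plain;q=0.5",
    ["text/plain", "text/html"],
)
```

If no format is acceptable to both sides, the sketch returns None and the server would have to refuse the request or fall back on a default.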

In 1993, Marc Andreessen, then a graduate student at NCSA, produced an X-window browser. Though I would term it primitive compared to the elegance and functionality of the NeXTStep browser, it had the marketing advantage of permitting colour images to be included. This sudden availability of colour pictures and proportional type fonts to the grey world of Unix gave the Web a boost it had not derived from anything else. Pictures were clearly the means to capture the imagination of the manager of your manager. The Internet programming community went wild. Mosaic became the synonym of WWW.

At the end of 1993, I decided that it was time to have all the early contributors meet each other in a great brainstorming session: I planned the first WWW Conference, which was held at CERN in May 1994.

1994 can truly be called the 'Year of the Web'. It became clear that CERN could no longer continue core development without external help, and a project was submitted to the European Community to fund a transitional phase of Web development in Europe. This project has a partner in the US. Its major aim is to ensure that there is a single, open standard in the Web mark-up and the Web communications protocol, based on a working reference implementation which is freely available.

The Web is now mentioned in any magazine at any time. Newspapers tell you how to connect to the Internet; URLs are routinely used in scientific journals to refer to information, and they proliferate on the teletext pages of MTV. In short, you can no longer get away from it.

3 A successful System

To be a global success, a system must have two basic properties:

- it must have a small learning threshold so many people will join in,

- it must be sufficiently scaleable to stand up after large numbers of users and publishers have in fact joined.

The Web satisfies these conditions because:

- it defines an easy to understand name space for documents which is open-ended and addresses synthetic documents thus allowing interfacing to data bases and systems generating documents.

- it works over the Internet, making it global and accessible to a large community of programmers,

- early HTML was easy to generate, so populating the Web with information from existing data bases could be done through simple server interfaces.

- separation of form and content allows documents to be shipped without worrying about the capabilities of the client, making the web easily portable to all platforms.

- it is also easy to populate the Web because servers can be set up without prior consultation with previous publishers.

- the name space is based on the Internet naming scheme, therefore the Web scales like the Net.

- there are only fleeting connections, so servers can handle requests serially.
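
The point about the name space can be made concrete with a small sketch using Python's standard 'urllib.parse' module. Both addresses are invented for illustration (the host 'www.example.org' and the 'phonebook' service are assumptions, not real servers): the first names a stored document, the second a synthetic one that a server would generate from a data base query. Both fit the same naming scheme, and the host part is an ordinary Internet name.

```python
from urllib.parse import urlsplit, parse_qs


def describe(url):
    """Split a document name into the parts a client and server act on."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,          # which protocol to speak
        "host": parts.hostname,          # an ordinary Internet host name
        "path": parts.path,              # interpreted by the server only
        "query": parse_qs(parts.query),  # parameters for generated documents
    }


# A stored document and a synthetic (generated) one, both hypothetical.
static_doc = describe("http://www.example.org/reports/1994/summary.html")
synthetic_doc = describe("http://www.example.org/phonebook?name=smith")
```

The client needs to understand only the scheme and the host; everything after that is opaque to it, which is what leaves the name space open-ended.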

But being a success does not mean you are better than others or even merely good. The Web also has well-known disadvantages:

- it is not easy to write browser/editors, which are more difficult to make than word processors.

- it is not easy to find information because indexes do not scale.

- it is easy to get lost.

- it is not possible to control the quality or authenticity of the information, leading to a social problem.

- being open and easy to add to, the danger of divergence into incompatible systems is great.

In any case, to ensure the future, we must:

- maintain interoperability,

- have open standards,

- keep systems mutatable,

- produce interfaces that 'can be understanded by them people'

Some of these points will become clear later.

4 Providing Information

4.1 SGML and Layout

For the context of the points following, it is necessary to understand something of the SGML philosophy and to remove a number of popular misconceptions.

HTML is not a subset of SGML.

What is today called HTML should in fact have been named 'the Web DTD'. For the sake of those who do not know all about SGML, I'll briefly describe its most important features. SGML is not a document format; it is a system for describing structures. It starts from the idea that there are sets of documents that look alike (or should look alike) in structure. For example, all novels are divided into elements called chapters, and each chapter is a sequence of paragraphs.

Elements of a document are marked-up by putting tags at the appropriate places in the text.

The structure of each set of documents can be described by a Document Type Definition (or DTD) which is a formal grammar about its elements.

A DTD defines only the structure, but not the presentation. The presentation of each element is given in a separate object, a so-called style sheet.

SGML then allows for complete separation of structure and presentation.

Now, if a person wants to communicate a document to you, you need in total three objects:

- the document itself, with the mark-up in it (tags),

- the DTD, which tells you what tags mean and how they relate to each other,

- the style sheet, which tells you how to present each element.
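
The three objects listed above can be sketched as a toy model in Python. This is not real SGML syntax: the document is held as nested (tag, children) pairs, the 'DTD' is reduced to a table of which elements may appear inside which, and the 'style sheet' is a table of presentation functions. All names and the sample text are assumptions made for illustration, following the novel/chapter/paragraph example given earlier.

```python
# The document: (tag, children) pairs; a plain string is character data.
document = ("novel", [
    ("chapter", [
        ("paragraph", ["First paragraph."]),
        ("paragraph", ["Second paragraph."]),
    ]),
])

# The toy DTD: which children each element may contain.
# "#text" stands for character data.
dtd = {
    "novel": {"chapter"},
    "chapter": {"paragraph"},
    "paragraph": {"#text"},
}

# The toy style sheet: how to present each element.
style_sheet = {
    "novel": lambda body: body,
    "chapter": lambda body: "* * *\n" + body,
    "paragraph": lambda body: "    " + body + "\n",
}


def is_valid(node, dtd):
    """Check that every element contains only what the DTD allows."""
    tag, children = node
    for child in children:
        child_tag = "#text" if isinstance(child, str) else child[0]
        if child_tag not in dtd.get(tag, set()):
            return False
        if not isinstance(child, str) and not is_valid(child, dtd):
            return False
    return True


def render(node, style_sheet):
    """Present a document using only the style sheet, never the DTD."""
    if isinstance(node, str):
        return node
    tag, children = node
    body = "".join(render(child, style_sheet) for child in children)
    return style_sheet[tag](body)
```

Swapping in a different style sheet changes the presentation without touching the document, and the validity check never looks at the style sheet at all: structure and presentation stay independent.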

The way in which WWW uses SGML is (slightly simplified):

- the Web DTD is agreed beforehand by everyone (and called HTML),

- the presentation is left to the browser (client) so each individual can set it to his/her liking.

- the document only is shipped from server to client.

We are now limited to HTML, since only the document is transferred. However, because there is no objection to shipping a complete MIME message from the server, including a DTD and a style sheet, there is room for a complete implementation of distributed SGML systems using the Web.

Keeping presentation separate from contents has enormous advantages:

- automated treatment of information is easy.

- no knowledge is needed of the presentation capabilities of the client.

It is far easier to look for author names when these are tagged properly in simple text documents than when they are embedded in proprietary formats using just text styles.

Thus, HTML tags like <b> for bold and <i> for italic are really meaningless and should be avoided.
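
A minimal sketch of the point about author names, using Python's standard 'html.parser' module. It assumes a hypothetical 'author' element (it is not part of HTML) and two invented sample sentences: in the first the names are tagged by meaning, in the second by appearance only.

```python
from html.parser import HTMLParser


class AuthorFinder(HTMLParser):
    """Collect the text found inside <author> elements (a hypothetical tag)."""

    def __init__(self):
        super().__init__()
        self.in_author = False
        self.authors = []

    def handle_starttag(self, tag, attrs):
        if tag == "author":
            self.in_author = True
            self.authors.append("")

    def handle_endtag(self, tag):
        if tag == "author":
            self.in_author = False

    def handle_data(self, data):
        if self.in_author:
            self.authors[-1] += data


def find_authors(text):
    finder = AuthorFinder()
    finder.feed(text)
    finder.close()
    return [name.strip() for name in finder.authors]


# Names tagged by meaning: a program can pick them out.
structural = ("<author>A. Writer</author> and <author>B. Scribe</author> "
              "report <b>important</b> results.")

# Names tagged by appearance: nothing distinguishes a name from emphasis.
visual = "<b>A. Writer</b> and <b>B. Scribe</b> report <b>important</b> results."
```

With the structural mark-up the names can be found mechanically; with the visual mark-up a program sees three bold runs and cannot tell which of them are people.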

The world is alas divided into those who want to supply information and those who want to supply advertising. Information may be given to humans or to their computerised agents. Advertising is normally done by visual impact directly to a receiving human, which requires complete control over layout. The image is omnipresent on the Web, showing that the majority of servers are actually providers of advertising.

4.2 The Paper Metaphor

We are learning how to provide information in the new medium of distributed hypertext. This early phase is dangerous: if we are not bold enough, we may never escape from the paper and book metaphor.

So far, a lot of existing stuff has been made available. This information was destined to be presented on paper. Existing methods and habits for preparing it are all more or less related to text processing for the printed page.

The microcomputers in use today are practically all employed for such purposes: printed reports. The Web does not need printing. It is useless to print a well-conceived hypertext that is kept up-to-date. A paper copy leaves me uneasy: is it still valid? Yet the most frequent question that novice users of the Web ask is 'can I print this?'. Yes, you can, but why?

For the same reason of attachment to the printed page, we have seen floods of converters from proprietary formats into HTML, but not a serious browser/editor and (to my knowledge) only a single rudimentary tool to assemble a number of Web documents into a larger one for printing.

Everyone seems to see the printed, word-processor document as the original, and the Web document as the derivative. This is bad news.

4.3 Existing Information

Existing information of course has to be published somehow, so converters are not completely undesirable. There are four courses of action to get existing documents published on the Web:

- do nothing at all. This is the easiest method and obviously consumes no resources but also achieves nothing.

- publish references. Here the Web user will find at least that the document exists, and perhaps a way of getting a copy somehow.

- publish as-is. Now some Web users can see the document (if they have an application that is capable of handling the proprietary format after transferring the document file). Others will see only the reference but will not be able to follow the link.

- convert to Web structures. This is best, but consumes most resources, since links have to be put in, redundancies have to be removed, and chunking (dividing the original up into reasonable hypertext portions) is not always obvious.
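
The mechanical part of the last option can be sketched as follows, assuming the hard decision (where to cut) has already been made by a person and the original arrives as an ordered list of titled sections. The function name 'chunk' and the file naming scheme 'node1.html', 'node2.html', ... are hypothetical. The sketch only adds the simplest links, previous and next; links by meaning and the removal of redundancies still need an author.

```python
def chunk(sections):
    """Turn an ordered list of (title, text) sections into linked nodes."""
    names = ["node%d.html" % i for i in range(1, len(sections) + 1)]
    nodes = []
    for i, (title, text) in enumerate(sections):
        nodes.append({
            "name": names[i],
            "title": title,
            "text": text,
            # Sequential links preserve the reading order of the original.
            "previous": names[i - 1] if i > 0 else None,
            "next": names[i + 1] if i + 1 < len(sections) else None,
        })
    return nodes


nodes = chunk([
    ("Introduction", "..."),
    ("Method", "..."),
    ("Results", "..."),
])
```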

4.4 New Stuff

For the new stuff, we should think of the far future. The Web is not the ultimate repository of human knowledge. But we want to keep the documents somehow, what they mean and what they have to say. Thus we should think: 'how are we going to read this a hundred years from now?' And I'll bet it will not be on an Intel-based PC. So we had better make sure that the semantics are preserved, and that their format is easy to convert from current systems to the next ones. We must make the contents mutatable. For me this is one of the more important reasons to use SGML-like encoding and to encourage authors to use it correctly.

5 Tools

5.1 Collaborative Tools

Currently we have a Read-Only Web. It is not possible to change a displayed document (unless you happen to use a copy of the NeXTStep browser), even if it belongs to you and you have all access rights to the file.

There are projects under way to change this situation, notably the GrifW3 which is based on wysiwyg SGML technology, but we are not there yet. In this respect, the Hyper-G system is far superior to the Web.

Once there is no longer a difference between a browser and a Web editor, people will be able to use the Web as a collaborative tool, developing information and documents together, independently of geographic location or time zones.

Note that a browser/editor gives you these advantages:

- no more problems with starting points, since the construction of my home page (see note at bottom of list) is now easy: it contains the pointers of the places I'm most interested in and perhaps some comments.

- hot lists are a thing of the past, since they are just edited local HTML files, and I can impose any structure on any number of them.

- personal annotations and group annotations and annotation servers and the like are just all unified into sets of HTML files again.

- a lot of converter use goes away.

- pagination, the nightmare of traditional text processing, is gone.

- bad HTML is never produced by a proper editor.

- some linking problems disappear (the editor can know about relative links).

- printing diminishes.

- you can organise your own notes in the same hypertext way as anything that you publish, and with drag-and-drop ease on your laptop.

(Note: a 'Home page' is where I as an individual start from when I launch a browser on my computer. Its contents are nobody's business. A 'Welcome page' is what I get from a server as the generic page when I specify only the server name in a URL. Currently, the term 'home page' is used for what should be called a 'welcome page'. The confusion comes from the fact that almost everyone gets a local welcome page when they start up a browser: the absence of editors makes it very difficult for non-expert users to build a home page.)
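
The remark in the list above about relative links can be illustrated with Python's standard 'urllib.parse.urljoin'. The addresses are invented for the example. A link written relative to the document that contains it keeps working when the whole set of documents moves to another server or directory, which is why an editor that knows where a document lives can write such links for the author.

```python
from urllib.parse import urljoin

# Where the document containing the links currently lives (hypothetical).
base = "http://www.example.org/notes/web/history.html"

# Links as an editor could store them: relative to the containing document.
sibling = urljoin(base, "tools.html")
parent = urljoin(base, "../index.html")

# The same relative link, after the whole tree has been copied elsewhere.
moved_base = "http://mirror.example.net/copy/web/history.html"
moved_sibling = urljoin(moved_base, "tools.html")
```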

5.2 Rhetoric

Of course, a new rhetoric has to be learned. As today, some people will write badly for the Web, others will be brilliant authors.

5.3 Navigation

Anyone who has seriously tried to work the Web knows that it takes no effort to get lost. Systems like Hyper-G provide you with excellent navigation tools; Gopher is so structured that it is easy to know where you are.

The distributed nature of the Web makes it difficult to show the user a map, unless he/she is prepared to incur a lot of network traffic overhead.

5.4 Construction

Even with collaborative tools and browser/editors we will still need a variety of tools to help us construct the information. In word processors there are items like outliners, style checkers, even tools to help you organise your thoughts. For a distributed system, we need tools like that, but they must concentrate on other aspects: finding existing materials, generic referencing, suggesting which phrases should be linked to explanations, helping cut long parts up and rearranging and merging shorter ones. Plenty of room for research.

6 Social and Cultural aspects

We have in computing seen the negative effects of too many inventors in too many areas. Short-term commercial interests often force us to adopt computing solutions that are frustratingly complicated and that direct us into dead-end streets, holding back real change for decades. Networking has not been an exception. Too much attention is devoted to backwards compatibility (a term invented by software developers?). Technology should not be worried about the past. I prefer the attitude expressed in this maxim [De Bono 91]:

Instead of being pushed by our history,
we should be pulled by our vision.
E. De Bono

6.1 Change

Inside the network, we are rebuilding the world we know: we use the book metaphor, we want total layout control (which is a help for visual navigation, but definitely not for computer aided navigation!), we talk about adding a worn-out look to frequently consulted objects (do I want my welcome page to look worn out by complete strangers?).

Is it possible to think about what the networked society should look like or is it only possible to let a thousand weeds grow?

6.2 Isolation

Communication is different from the Gigabit/second number. There is a lot of 'communicating' being done, yet I know nobody who is happy receiving 100 e-mail messages a day. You feel obliged to answer (a vestige of the days of personal contact?) but what you get is the stress of having done so badly, hurriedly and tiredly. Increased isolation of individuals has been the result of increased exteriorisation of information. Books made it possible to know something without having to contact the author. Radio and TV spread news without personal contact. The Walkman effectively shields a person from investing in contact with others on public transport trips and certainly is used in this way. A recent article [xxx 94] proclaimed as an advantage of the global network that

'Sex, location of a partner, video marriage,...

You can have any kind of interaction without the inconvenience of having someone in your house'

The key word is 'inconvenience'. The lack of social contact is perhaps the most negative side of our Western culture, and is so perceived in most other cultures.

6.3 Life, the Universe and Everything

In an article of 1977, describing his graphical user interface, Alan Kay wrote [Kay 77]:

There are three reactions to the introduction of a new medium:

illiteracy,

literacy

and artistic creation

He goes on to say:

After reading material became available the illiterate were those who were left behind by the new medium. It was inevitable that a few creative individuals would use the written word to express inner thoughts and ideas. The most profound changes were brought about in the literate. They did not necessarily become better people or better members of society, but they came to view the world in a way quite different from the way they had viewed it before, with consequences that were difficult to predict or control.

How will the networked society influence the daily lives of normal people? Below are a number of questions. In the current social structures, I can think of negative answers only:

- will employment go up?

- will people feel better?

- will general education improve?

- will the world be a safer place?

- what will happen to the service sector?

- will power be in the hands of benevolent people?

- is this what we want?

Is it therefore not more urgent to work on structural reforms in our culture rather than on laying down fibre cables or shooting 700 satellites into the sky?

6.4 Work

The argument against Malthus' predictions of overpopulation was that food production could be made to expand faster than the population growth. However, there really is a limit to the number of people that can live on earth, and only the very obstinate now hold that we have no population problem.

Likewise, an old economic argument has been that the introduction of machines will not take away work, just displace it to other activities. So we have seen a massive movement from agriculture to industry and then to services. But the computer is not a machine like any used in the industrial revolution and after: it displaces people.

Consider the distribution of work. Before 1900, there was lots of work, despite mechanisation. Jobs of low and high intellectual content were plenty. There was even a healthy overlap in the middle.

During this century, we have witnessed a constantly growing separation: jobs are either menial or intellectual, and there are fewer of them. They are less gratifying. Menial jobs are done by immigrant workers, intellectual jobs need high qualification which not many can attain. The young generation is acutely aware of the 'No Future' syndrome.

With the massive introduction of computing, fewer and fewer real jobs remain. The service sector which absorbed people during the boom years no longer does: those who are unemployed remain so. The recovery of the economy does not result in higher employment.

Networking makes the problem more acute: when you can learn immediately from the best in the world, why go anywhere local?

Will the networked society be able to create more jobs or will it just lead to much more efficient service companies, leading to higher unemployment? Maybe the time has finally come to start working less and less hard.

7 Europe

I was once told that 'Europe has a cultural deficit in networking'. Brilliantly and concisely expressed. I set out to calculate the value of the deficit as a simple expression: the number of networks in use per million inhabitants. The Internet statistics give numbers of networks, an encyclopaedia numbers of people. The graph below needs no comment. One could argue whether I now like this situation or not, given what I wrote before.
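
The expression described above is a simple ratio, sketched here in Python. The function name is hypothetical and the figures are invented placeholders chosen only to show the arithmetic; they are not the Internet statistics or the population counts used for the graph.

```python
def networks_per_million(networks, inhabitants):
    """Number of networks in use per million inhabitants."""
    return networks / (inhabitants / 1_000_000)


# Purely illustrative inputs for two imaginary regions, not real statistics.
region_a = networks_per_million(networks=500, inhabitants=50_000_000)
region_b = networks_per_million(networks=3_000, inhabitants=100_000_000)
```

Dividing by population rather than comparing raw counts is what makes regions of very different sizes comparable.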

Europe has many assets, and I am attached to this strange assemblage of peninsulas. But we have one big problem: reacting speedily and nimbly to situations which need mobilisation of large resources. We seem to be poor at exploiting ideas that we generate here, especially in high-tech. Will Europe play a real role in the global networked society or will we have to buy everything from a US software company, even though we keep generating the important ideas? Three axes are important for any society that wants to partake of the coming network culture [Abramatic 95]:

- core technology, (you need to understand the options),

- tools (their use determines your competitive efficiency),

- content published (your visibility in the marketplace).

The content of a network server in WWW is what makes people look at it. We can make a big impact there and remain level with other parts of the world.

The making of the content is dependent on the tools you use. If Europe makes no good tools to support its diverse cultures, then we will have to wait until someone else supplies us with them. Or doesn't.

Tool construction in turn depends on intimate knowledge of the core technology. It is not only important that the WWW standards remain open. It is even more important for us here that Europe maintains a strong core development effort so that our computer scientists can work locally on projects that will give them the necessary expertise. Our continued presence in the tools and contents areas rests on this expertise.

8 Brief Future & Conclusions

Networks will keep up with the traffic: speed will go up, costs will come down. There will, as on normal motorways, be traffic jams here and there. But on the whole, things will keep pace.

Collaborative work will take off especially in research, where it traditionally has been, but also inside big companies.

A new rhetoric will develop, as compelling as that of advertising. This may distort our perception of reality.

The networked society will probably fall apart not just into rich and poor, but into rich informed and poor information-illiterates.

The Web is fast becoming an entertainment medium. Perhaps Andreessen's ideas of a browser already contained the seeds of the Web entertainment business. But for me, a member of the minority who want to stay in real reality, the questions for humanity are:

- do we want entertainment or collaborative tools?

- do we want to drown in multimedia sense overload or do we want text searches?

- do we want proprietary encryption or trusted carriers?

-... or...?

Maybe I'm just old-fashioned...

References

[Nelson 88] Theodore H. Nelson: 'Literary Machines', The Mindful Press, 1988.

[De Bono 91] E. De Bono: 'I am right, You are wrong', Penguin, 1991.

[xxx 94] Airline in-flight journal, November 1994 (?)

[Abramatic 95] J.F. Abramatic, INRIA, private communication.

[Kay 77] Alan C. Kay: 'Microelectronics and the Personal Computer', Scientific American, September 1977, p.231.