Data – Information – Knowledge

Who doesn’t know the classic distinction between data, information, and knowledge? And who hasn’t seen at least one version of the famous pyramid with data on the bottom, information making up the next level, and knowledge making up the layer above that, with the peak consisting of wisdom? There are several assumptions and implications of this model of data, information, and knowledge: First, it is assumed that they are qualitatively and quantitatively different. Data, for example, is different from information and there is more data than information just as there is more information than knowledge, with wisdom being the rarest sort of knowledge. Second, it is assumed that they are hierarchically interdependent, that is, you cannot have information without data or knowledge without information, but you could have data without information and information without knowledge. Third, the hierarchy implies a value judgement, that is, data is not as valuable as information and information is not as valuable as knowledge, and of course, wisdom is the most valuable of all. Finally, the hierarchy also implies a kind of temporal or ontological priority. Since information depends on data, data must come first, and since knowledge depends on information, information comes before knowledge, at least temporally. This means that first we have data, then we somehow construct information out of data, and then we can go on to construct knowledge out of information. Data is something like the raw material out of which information is constructed, and information is the raw material out of which knowledge is constructed. There is nothing in the model that implies how this construction process works. The model itself does not tell us where data comes from or how exactly information is constructed out of data or knowledge out of information. In order to answer these questions, we are left to speculation.

There is of course a kind of consensus among interpreters that there are also different kinds of construction. In short, these may be termed “transcription,” “cognition,” and “praxis.” Data are said to be constructed by means of some kind of transcription, that is, something is preserved, fixed in some material form, in some medium, whether it be sound, text, or pictures. Today, data is above all transcribed into bits and bytes, that is, into digital media, which, as the dominant media in today’s world, also determine what is usually meant by the term “data.” Data are just bits and bytes, 1s and 0s, electronically fixed upon some memory medium. Information is usually thought to be constructed out of data. When the otherwise meaningless bits and bytes are combined into signs in a language and are given meaning, then data becomes information. This is above all a cognitive process. Somebody makes “sense” of the images or marks on paper or the bits and bytes on the chip via a cognitive process of “reading.” This is information. But it is not yet knowledge. Knowledge is what information becomes when it is used practically to solve some specific problem. The practical use of information in problem-solving activities is called “praxis.” It is in praxis that mere information, for example, mere theory or mere textbook knowledge, becomes situated in a particular context in the real world. It is through praxis that we know what information is good for, what it can do, how it can be used in complex situations. Knowledge is knowing by doing. This separates the apprentice from the master, the inexperienced from the experienced. It is the experienced master who alone can be said to possess knowledge.

It is often explicitly or implicitly admitted that learning, that is, the construction of new knowledge, only occurs in praxis. It is only when information is used to solve a problem, when there is contact with real circumstances, unexpected events, unforeseen results, etc., that new knowledge comes into being. This is learning and it is experience. Learning, it is said, comes by doing. But experience is only the beginning; what is learned must be transcribed, made explicit in some medium, in order to create knowledge. That is, praxis leads to the transcription of new data, which leads to new information and thus to new knowledge. This is a circular process, a never-ending cycle. At this point a difficulty with the hierarchical model of data, information, and knowledge emerges. A circle is not a pyramid. There is no hierarchy, no privileged position, no value judgement implicit in a circle. There is no temporal or ontological priority. Everything is on the same level and everything is equally prior to everything else. In short, there are no linear dependencies, no beginning and no end, no top-down or bottom-up. This makes the traditional pyramid model of the relations between data, information, and knowledge appear arbitrary and unfounded. If data comes from praxis, then why should one begin with data and not with knowledge? Why does praxis only come last in the construction of knowledge, when in fact, it comes first? Is not transcription also a kind of praxis? Is not making meaning out of data also a praxis? Is data, in the digital sense of the word, even possible without meaning and the praxis of transcription? And does not this praxis also require knowledge as well as create knowledge? It would seem that the typical assumptions we make about the relations between data, information, and knowledge are not well-founded and perhaps seriously misleading.

Perhaps instead of beginning with data, we should begin with praxis and understand data and information as certain kinds of knowledge bound up with certain kinds of praxis. The distinction between data, information, and knowledge is therefore neither qualitative, nor quantitative, nor hierarchical, nor based on value judgements. Data is not something that has less value than information, since it is itself a kind of information as well as a kind of transcription praxis. Learning takes place in the transcription of data as well as everywhere else where praxis does what it does. But what exactly does praxis do? To begin with, praxis does not introduce order and structure into a disordered and undifferentiated “manifold” – in Kant’s terminology – of data. Praxis instead begins with a problem to be solved. The problem, in order to present itself as such at all, is already in some way understood: We have to hunt animals in order to eat. We have to build a shelter to protect us from the elements. We have to communicate with others and coordinate our activities in order to achieve goals that we could not achieve alone. It is in the course of doing these things that we learn how to do them better, including developing digital technologies. The path to knowledge does not begin with data, but with praxis. Tens of thousands of years of praxis have led to using bits and bytes, computers, and digital networks instead of stone axes in order to make praxis more effective. It would therefore appear that praxis does not reduce complexity, but increases it. It does not construct closed systems, but connects everything to everything, always adding, by means of learning, new elements, new nodes in the network. Data and information are currently important nodes in digital networks, but they are not what the network is founded upon. They do not build some kind of basis upon which knowledge is constructed. The network is founded upon nothing outside itself, that is, nothing prior to its own activity of making connections, discovering new relations, and binding more and more things into itself. From this perspective, the traditional distinctions between data, information, and knowledge seem questionable. Perhaps new thinking is needed in order to understand what these terms, which have become so important in today’s world, really mean.