• Computer data or information storage, often
called storage or memory, refers to computer
components, devices, and recording media that
retain digital data used for computing for some
interval of time.
• Computer data storage provides one of the core
functions of the modern computer, that of
information retention.
• Different types of information are stored in
different ways, depending on what the
information is, how much storage space it
requires, and how quickly it needs to be
accessed.
• The computer stores this information in its “short-term”
memory and its “long-term” memory.
• Your system memory (or RAM) holds information that you
are working with right now. This is the computer’s “short-
term memory”, and is designed to be able to feed
information to the processor at high speed so the
processor is not slowed down too much while waiting for it.
• However, this short-term memory disappears when the
computer is turned off. This is why you must always save a
file that you are working on before turning off the machine.
• Longer-term storage is provided by your hard disk drive,
floppy drive and other devices, where information is
stored permanently in the form of files, ready for you to
retrieve when you need it. When you want to use your
spreadsheet program, for example, the computer loads the
program's instructions from long-term storage (your hard
disk) into short-term memory.
• Every computer comes with a certain amount of physical
memory, usually referred to as main memory or RAM.
• You can think of main memory as an array of boxes, each
of which can hold a single byte of information. A
computer that has 1 megabyte of memory, therefore, can
hold about 1 million bytes (or characters) of information.
• There are several different types of memory:
• RAM (random-access memory): This is the same as main
memory. When used by itself, the term RAM refers to read
and write memory; that is, you can both write data into
RAM and read data from RAM.
• This is in contrast to ROM, which permits you only to read
data. Most RAM is volatile, which means that it requires a
steady flow of electricity to maintain its contents. As soon
as the power is turned off, whatever data was in RAM is
lost.
• ROM (read-only memory): Computers almost always
contain a small amount of read-only memory that holds
instructions for starting up the computer. Unlike RAM,
ROM cannot be written to.
• PROM (programmable read-only memory): A PROM is a
memory chip on which you can store a program. But once
the PROM has been used, you cannot wipe it clean and use
it to store something else. Like ROMs, PROMs are
non-volatile.
• EPROM (erasable programmable read-only memory): An
EPROM is a special type of PROM that can be erased by
exposing it to ultraviolet light.
• EEPROM (electrically erasable programmable read-only
memory): An EEPROM is a special type of PROM that can be
erased by exposing it to an electrical charge.
PURPOSE OF STORAGE
• A digital computer represents data using the binary numeral
system. Text, numbers, pictures, audio, and nearly any other
form of information can be converted into a string of bits, or
binary digits, each of which has a value of 1 or 0.
• The most common unit of storage is the byte, equal to 8 bits.
A piece of information can be handled by any computer
whose storage space is large enough to accommodate the
binary representation of the piece of information, or simply
data.
• For example, using eight million bits, or about one
megabyte, a typical computer could store a small novel.
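The conversion of text into bits and bytes described above can be sketched in a few lines of Python. This is only an illustration: the helper name `to_bits` is hypothetical, and UTF-8 is an assumed encoding that the text does not specify.

```python
def to_bits(text: str) -> str:
    """Return the binary representation of text, 8 bits per byte (UTF-8 assumed)."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


# Each character of "Hi" occupies one byte, i.e. eight bits.
print(to_bits("Hi"))  # 01001000 01101001

# Eight million bits expressed in bytes (8 bits per byte).
print(8_000_000 // 8)  # 1000000
```

The second print shows the arithmetic behind "eight million bits, or about one megabyte".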
• Without a significant amount of memory, a computer would
merely be able to perform fixed operations and immediately
output the result. It would have to be reconfigured to change
its behavior.
• This is acceptable for devices such as desk calculators or
simple digital signal processors. Von Neumann machines differ
in that they have a memory in which they store their
operating instructions and data.
• Such computers are more versatile in that they do not
need to have their hardware reconfigured for each new
program, but can simply be reprogrammed with new in-
memory instructions; they also tend to be simpler to
design, in that a relatively simple processor may keep
state between successive computations to build up
complex procedural results. Most modern computers are
von Neumann machines.
• In practice, almost all computers use a variety of memory
types, organized in a storage hierarchy around the CPU, as
a tradeoff between performance and cost.
• Generally, the lower a storage tier sits in the hierarchy, the
lower its bandwidth and the greater its access latency
from the CPU.
• This traditional division of storage into primary, secondary,
tertiary and off-line storage is also guided by cost per bit.
TYPES OF STORAGE DEVICES
• Storage devices are the hardware components that
a computer uses to store data.
• A computer has many types of data storage
devices. Some are classified as removable
storage devices and others as non-removable
storage devices.
• Storage devices come in many sizes and
shapes. They are among the most important
components of a computer system.
• Memory is of two types: primary
memory and secondary memory.
• Primary memory is volatile and
secondary memory is non-volatile.
• Volatile memory loses its contents when the
power is turned off, while non-volatile memory
retains its contents without power.
• Secondary memory is used to store data
permanently in the computer. The most common
secondary storage device is the hard disk drive.
• Others include floppy disk drives,
CD-ROM and DVD-ROM drives, flash memory,
and USB data cards.
• Storage devices record data on a storage surface. Memories also
differ in architecture and design: optical data storage, magnetic
storage media, mechanical storage media, and flash memory
devices.
• A storage device is a peripheral unit that holds data, such as a
tape, disk, or flash memory card. Most drives used for data storage
are fragile, and the data on them can easily be corrupted. Storage
devices are also used for backup. They used to be costly, but prices
are falling steadily, so a storage device can now be bought for
considerably less than an earlier drive. The technology has also
improved, and storage capacities have reached the terabyte (TB)
range.
• The data in storage devices can take the form of files, databases,
digital video, audio, and so on. Non-volatile storage devices keep
the data permanently unless it is deliberately erased. This is the
case for hard disk drives and floppy disk drives.
• Other storage media, such as CDs and DVDs, come in two types.
On the first type, data once written cannot be erased; it is stored
permanently. The second type is rewritable: data that has been
written can be erased completely, and the same disc can be used
again to store different data.
FILE ORGANIZATION
• Studying file organization is an important aspect of computer science,
because all information on the hard disk is stored in organized
formats that are collectively known as files.
• Given that a file consists, generally speaking, of a collection of records, a
key element in file management is the way in which the records
themselves are organized inside the file, since this heavily affects system
performance as far as finding and accessing records is concerned.
• Note that by “organization” we refer here to the
logical arrangement of the records in the file (their
ordering or, more generally, the presence of “closeness”
relations between them based on their content), and not
to the physical layout of the file as stored on a
storage medium. File organization is the methodology which
is applied to structured computer files.
• Here we will look at two components of file organization:
• The way the internal file structure is arranged, and
• The external file as it is presented to the O/S or program
that calls it. Here we will also examine the concept of file
extensions.
• We will examine various ways that files can be stored and
organized. Files are presented to the application as a
stream of bytes and then an EOF (end of file) condition.
• A program that uses a file needs to know the structure of
the file and needs to interpret its contents.
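As a minimal sketch of the "stream of bytes followed by an EOF condition" view, the Python snippet below reads a file in small chunks until the read call returns an empty result. The function name `read_in_chunks`, the 4-byte chunk size, and the temporary demo file are assumptions made for illustration.

```python
import os
import tempfile


def read_in_chunks(path: str, chunk_size: int = 4) -> list[bytes]:
    """Read a file as a byte stream, stopping at the end-of-file condition."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object signals EOF
                break
            chunks.append(chunk)
    return chunks


# Create a small demo file, read it back as a stream, then clean up.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
demo_path = tmp.name
demo_chunks = read_in_chunks(demo_path)
os.remove(demo_path)
print(demo_chunks)  # [b'0123', b'4567', b'89']
```

The file itself carries no structure here; it is the reading program that decides how to interpret the bytes.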
Internal File Structure
Methods and Design Paradigm
• It is a high-level design decision to specify a system of file
organization for a computer software program or a computer
system designed for a particular purpose.
• Performance is high on the list of priorities for this design
process, depending on how the file is being used.
• The design of the file organization usually depends mainly on
the system environment.
• For instance, factors such as whether the file is going to be
used for transaction-oriented processes like OLTP or Data
Warehousing, or whether the file is shared among various
processes like those found in a typical distributed system or
• It must also be asked whether the file is on a network and
used by a number of users and whether it may be accessed
internally or remotely and how often it is accessed.
Important considerations might be:
• Rapid access to a record or a number of records
which are related to each other.
• The addition, modification, or deletion of records.
• Efficiency of storage and retrieval of records.
• Redundancy, being the method of ensuring data
• A file should be organized in such a way that the
records are always available for processing with
no delay.
• This should be done in line with the activity and
volatility of the information.
Types of File Organization
• Organizing a file depends on what kind of file it happens to be. A file
in the simplest form can be a text file (in other words, a file which is
composed of ASCII (American Standard Code for Information
Interchange) text).
• Files can also be created as binary or executable types (containing
elements other than plain text.) Also, files are keyed with attributes
which help determine their use by the host operating system.
Techniques of File Organization
The techniques of file organization are:
• Heap (unordered)
• Sequential (SAM)
• Line Sequential (LSAM)
• Indexed Sequential (ISAM)
• Hashed or Direct
• In addition to these techniques, files can be organized by several
methods: sequential, line sequential, indexed sequential,
inverted list, and direct or hashed access organization.
• A sequential file contains records organized in the order
they were entered.
• The order of the records is fixed.
• The records are stored and sorted in physical, contiguous
blocks; within each block the records are in sequence.
• Records in these files can only be read or written
sequentially. Once stored in the file, the record cannot be
made shorter, or longer or deleted.
• However, the record can be updated if the length does not
change.
• New records will always appear at the end of the file.
• If the order of the records in a file is not important,
sequential organization will suffice, no matter how many
records you may have.
• Sequential output is also useful for report printing or
sequential reads which some programs prefer to do.
• Line-sequential files are like sequential files,
except that the records can contain only
characters as data.
• Line-sequential files are maintained by the
native byte stream files of the operating
system. In the COBOL environment, line-
sequential files that are created with WRITE
statements with the ADVANCING phrase can
be directed to a printer as well as to a disk.
• Key searches are improved by this system too.
• The single-level indexing structure is the simplest one: the index
is a file whose records are (key, pointer) pairs.
• The pointer is the position in the data file of the record with
the given key.
• A subset of the records, which are evenly spaced along the
data file, is indexed, in order to mark intervals of data
• This is how a key search is performed: the search key is
compared with the index keys to find the highest index key
that does not exceed the search key; a linear search is then
performed from the record that the index key points to, until
the search key is matched or until the record pointed to by
the next index entry is reached.
• Regardless of double file access (index + data) required by
this sort of search, the access time reduction is significant
compared with sequential file searches.
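The indexed search just described can be sketched as follows. This is an illustrative model, not a real ISAM implementation: the data file is a Python list sorted by key, the sample records and the spacing `STEP` are assumptions, and `bisect_right` from the standard library stands in for the comparison against the index keys.

```python
from bisect import bisect_right

# Data file: records sorted by key (hypothetical sample data, keys 10..190).
records = [(k, f"record-{k}") for k in range(10, 200, 10)]

STEP = 4  # index every 4th record (an assumed spacing)
index = [(records[i][0], i) for i in range(0, len(records), STEP)]
index_keys = [key for key, _ in index]


def lookup(search_key):
    """Find a record via the sparse index, then scan one interval linearly."""
    # Highest index key that does not exceed the search key.
    slot = bisect_right(index_keys, search_key) - 1
    if slot < 0:
        return None  # smaller than every indexed key
    start = index[slot][1]
    # Stop at the record pointed to by the next index entry, if any.
    end = index[slot + 1][1] if slot + 1 < len(index) else len(records)
    for pos in range(start, end):
        key, value = records[pos]
        if key == search_key:
            return value
    return None


print(lookup(70))  # record-70
print(lookup(75))  # None
```

Only one interval of `STEP` records is scanned per search, which is where the saving over a full sequential scan comes from.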
Direct or Hashed Access
• With direct or hashed access a portion of disk
space is reserved and a “hashing” algorithm
computes the record address.
• So there is additional space required for this
kind of file in the store.
• Records are placed randomly throughout the
• Records are accessed by addresses that specify
their disk location.
• Also, this type of file organization requires disk
storage rather than tape.
• It has an excellent search retrieval performance,
but care must be taken to maintain the indexes.
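A minimal sketch of hashed access is shown below, assuming integer keys, a reserved area of `NUM_SLOTS` slots, a simple modulo hash, and linear probing for collisions. None of these choices comes from the text; they are common textbook simplifications.

```python
NUM_SLOTS = 11  # size of the reserved space (assumed)
slots = [None] * NUM_SLOTS


def address(key: int) -> int:
    """Hashing algorithm: map a key to a slot address (modulo hash assumed)."""
    return key % NUM_SLOTS


def store(key: int, value: str) -> int:
    """Place a record at its hashed address, probing forward on collision."""
    pos = address(key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None or slots[pos][0] == key:
            slots[pos] = (key, value)
            return pos
        pos = (pos + 1) % NUM_SLOTS
    raise RuntimeError("reserved space is full")


def fetch(key: int):
    """Retrieve a record by recomputing its address from the key."""
    pos = address(key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None:
            return None
        if slots[pos][0] == key:
            return slots[pos][1]
        pos = (pos + 1) % NUM_SLOTS
    return None


print(store(25, "A"))  # 3
print(store(36, "B"))  # 4 (25 and 36 collide at address 3)
print(fetch(36))       # B
```

The unused slots illustrate why this organization needs additional reserved space.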
External File Structure and File Extensions
Microsoft Windows and MS-DOS File Systems
• The external structure of a file depends on whether it
is being created on a FAT or NTFS partition.
• FAT (File Allocation Table)
• NTFS (New Technology File System)
• VFAT (Virtual File Allocation Table)
• The maximum filename length on a NTFS partition is
• 11 characters on FAT (8-character name + “.” + 3-character
extension)
• NTFS filenames keep their case, whereas FAT filenames
have no concept of case
• Also, there is the new VFAT which permits 256
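The FAT "8 + 3" naming rule and its lack of case sensitivity can be illustrated with a short check. This is a simplified sketch: the helper names `is_8_3_name` and `same_fat_name` are hypothetical, and only lengths are checked, whereas real FAT also restricts which characters may appear.

```python
def is_8_3_name(filename: str) -> bool:
    """Check only the length rule: up to 8 name characters, up to 3 for the extension."""
    name, dot, ext = filename.rpartition(".")
    if not dot:  # no "." present, so the whole string is the name
        name, ext = filename, ""
    return 1 <= len(name) <= 8 and len(ext) <= 3 and "." not in name


def same_fat_name(a: str, b: str) -> bool:
    """Compare two names the way a case-insensitive file system would."""
    return a.upper() == b.upper()


print(is_8_3_name("README.TXT"))        # True
print(is_8_3_name("longfilename.txt"))  # False
print(same_fat_name("Report.doc", "REPORT.DOC"))  # True
```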
UNIX and Apple Macintosh File Systems
• The concept of directories and files is fundamental to
the UNIX operating system.
• On Microsoft Windows-based operating systems,
directories are depicted as folders and moving about
is accomplished by clicking on the different icons.
• In UNIX, the directories are arranged as a hierarchy
with the root directory being at the top of the tree.
• The root directory is always depicted as /. Within the
/ directory, there are subdirectories
• Files can be written to any directory depending on
• Files can be readable, writable and/or executable.
DATA COMMUNICATION—AN OVERVIEW
• Data communication is defined as the transfer of
information between two points,
• either via an analogue (sine wave) electrical signal,
• or via a digital (binary) signal carried by electrical pulses,
• or optically via light pulses.
• A simple scenario would be two personal computers in
the same building, but 50 feet away from each other.
• By hooking up a cable between the two personal
computers, we now have Data Communications.
• The extent of Data Communications builds from this
point on, since there are many factors, such as
distance, topology, protocol, signalling, security, etc.,
that determine how Data Communications will take
place.
Analog Signals vs Digital Signals
• An analog signal is any continuous signal for which the
time-varying feature (variable) of the signal is a representation
of some other time-varying quantity.
• Analog is usually thought of in an electrical context;
however, mechanical, pneumatic, hydraulic, and other
systems may also convey analog signals.
• Digital signals consist of patterns of bits of information.
• Modern digital computers store and process all kinds of
information as binary patterns.
• All the pictures, text, sound and video stored in a
computer are held and manipulated as patterns of binary
digits.
• The main advantage of digital signals over analog signals is
that the precise signal level of the digital signal is not vital.
• Signals can be in analog or digital form.
• Analog signals can have an infinite number of
values in a range; digital signals can have only a
limited number of values.
• The difference between analog and digital is
similar to the difference between continuous-
time and discrete-time.
• Analog corresponds to a continuous set of
possible function values,
• while digital corresponds to a discrete set of
possible function values.
• A common example of a digital signal is a binary
sequence, where the values of the function can
only be one or zero
Periodic vs Aperiodic
• Periodic signals repeat with some period T, while
aperiodic, or non-periodic, signals do not. We can
define a periodic function through the following
mathematical expression, where t can be any number
and T is a positive constant:
• x(t + T) = x(t)
• A continuous-time signal that does not have a definite
pattern and does not repeat at regular intervals of time
is known as a continuous-time aperiodic
signal or non-periodic signal.
• In other words, a signal x(t) for which no value of
T satisfies the condition of periodicity is known
as an aperiodic or non-periodic signal.
• Data rate limits: how fast we can send data, in bits
per second, over a channel. The data rate depends on:
• The bandwidth available
• The levels of signals we can use
• The quality of the channel(level of the noise)
Noiseless Channel: Nyquist Bit Rate
• For a noiseless channel, the Nyquist bit rate formula defines the
theoretical maximum bit rate:
• BitRate = 2 * Bandwidth * Log2(L)
• Bandwidth is the bandwidth of the channel,
• L is the number of signal levels used to represent data, and
• BitRate is the bit rate in bits per second.
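The Nyquist bit rate is commonly written as BitRate = 2 * Bandwidth * Log2(L). The sketch below evaluates it for an assumed 3000 Hz channel; the function name `nyquist_bit_rate` and the sample values are illustrative only.

```python
import math


def nyquist_bit_rate(bandwidth_hz: float, levels: int) -> float:
    """Theoretical maximum bit rate of a noiseless channel, in bits per second."""
    return 2 * bandwidth_hz * math.log2(levels)


print(nyquist_bit_rate(3000, 2))  # 6000.0
print(nyquist_bit_rate(3000, 4))  # 12000.0
```

Doubling the number of signal levels from 2 to 4 adds one bit per signal element, which doubles the bit rate for the same bandwidth.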
Noisy Channel: Shannon Capacity
• In reality, we cannot have a noiseless channel: the
channel is always noisy. In 1948 Claude Shannon
introduced a formula, called the Shannon capacity, to
determine the theoretical highest data rate for a noisy
channel:
• Capacity = Bandwidth * Log2(1 + SNR)
• Bandwidth is the bandwidth of the channel,
• SNR is the signal-to-noise ratio, and
• Capacity is the capacity of the channel in bits per
second.
• SNR is the ratio of the power of the signal to the power of
the noise.
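As a worked example of Capacity = Bandwidth * Log2(1 + SNR), the sketch below uses an assumed 3000 Hz channel. The function name `shannon_capacity` and the SNR values are chosen for illustration; SNR is given as a plain power ratio, not in decibels.

```python
import math


def shannon_capacity(bandwidth_hz: float, snr: float) -> float:
    """Theoretical highest data rate of a noisy channel, in bits per second."""
    return bandwidth_hz * math.log2(1 + snr)


print(shannon_capacity(3000, 1023))  # 30000.0
print(shannon_capacity(3000, 0))     # 0.0
```

With an SNR of 0 the signal is lost in the noise and the capacity is zero, whatever the bandwidth.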
Basic Data Communication Model
• Communication is the conveyance of a message from one
entity, called the source, to another, called the destination.
• A simple example of such a communication system is
conversation; people commonly exchange verbal messages,
with the channel consisting of waves of compressed air
molecules at frequencies which are audible to the human
ear.
• The only way that a message source can be certain that
the destination properly received the message is by some
kind of acknowledgment response from the destination.
• Conversing people might say "I understand" in response to
a statement made by their peer. This acknowledged form
of dialogue is the basis of reliable communications.
• The conveyance of a message could be direct between the
corresponding entities or it could be indirect, with one or
more intermediaries participating in the message transport.
• Communication can be from a source to a single destination,
known as point-to-point,
• or to multiple destinations, known as point-to-multipoint or
multicast.
• A special case of multicast is the conveyance of a message
from a source to every possible destination, which is
referred to as broadcast.
• The broadcast can be local or global in scope.
• Depending on the definitions of source, destination and
channel, the communication could be asynchronous or
synchronous.
• Communication is called asynchronous if the sender and
receiver do not need to synchronize before each
transmission.
• In synchronous communication, there is a minimal assumed
timing relationship between the source and destination.
The Communications Channel
• A communications channel is a pathway over
which information can be conveyed.
• A communication channel can be simplex, in
which only one party can transmit;
• full-duplex, in which both correspondents can
transmit and receive simultaneously; or
• half-duplex, in which the correspondents
alternate between transmitting and receiving.
• Communication between two entities can be
considered either in-band or out-of-band,
depending on context.
• In-band communication is communication
which occurs via the primary channel
between the communicating entities.
• Out-of-band communication occurs via an
alternative channel, which is not considered
to be the primary channel between the
communicating entities.
• The message source is the transmitter, and the destination
is the receiver.
• A channel whose direction of transmission is unchanging is
referred to as a simplex channel.
• For example, a radio station is a simplex channel because it
always transmits the signal to its listeners and never allows
them to transmit back.
• A half-duplex channel is a single physical channel in which
the direction may be reversed.
• Messages may flow in two directions, but never at the
same time, in a half-duplex system.
• In a telephone call, one party speaks while the other
listens. After a pause, the other party speaks and the first
listens.
• Speaking simultaneously results in garbled sound that
cannot be understood.
• A full-duplex channel allows simultaneous
message exchange in both directions.
• It really consists of two simplex channels, a
forward channel and a reverse channel, linking
the same points.
• The transmission rate of the reverse channel
may be slower if it is used only for flow control
of the forward channel.
• Bandwidth: how much information can be
conveyed across the channel in a unit of time,
commonly expressed in bits per second or bps.
• A higher bandwidth is usually a good thing in a
channel because it allows more information to
be conveyed per unit of time.
• High bandwidths mean that more users can
share the channel, depending on their means of
• High bandwidths also allow more demanding
applications (such as graphics) to be supported
for each user of the channel.
• Quality: how reliably the information can be
conveyed across the channel, commonly expressed in terms of bit
error rate (BER) and whether the channel is dedicated
(to a single source) or shared (by multiple sources).
• A low quality channel is prone to distorting the
messages it conveys;
• a high quality channel preserves the integrity of the
messages it conveys.
• Depending on the quality of the channel in use
between communicating entities, the probability of the
destination correctly receiving the message from the
source might be either very high or very low.
• If the message is received incorrectly it needs to be
retransmitted.
• Bit-serial transmission conveys a message one bit
at a time through a channel.
• Each bit represents a part of the message.
• The individual bits are then reassembled at the
destination to compose the message.
• In general, one channel will pass only one bit at a
time.
• Thus, bit-serial transmission is necessary in data
communications if only a single channel is
available.
• Bit-serial transmission is normally just called
serial transmission and is the chosen
communications method in many computer
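Bit-serial transmission and reassembly at the destination can be sketched as below. The function names are hypothetical, and sending the most significant bit first is an assumption; some real serial links send the least significant bit first.

```python
def to_bit_stream(message: bytes) -> list[int]:
    """Serialize a message into individual bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]


def from_bit_stream(bits: list[int]) -> bytes:
    """Reassemble the received bits, eight at a time, into the original bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


stream = to_bit_stream(b"OK")
print(stream[:8])               # [0, 1, 0, 0, 1, 1, 1, 1]
print(from_bit_stream(stream))  # b'OK'
```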
• Byte-serial transmission conveys eight bits at a time
through eight parallel channels.
• The raw transfer rate is eight times faster than in bit-serial
transmission, but eight channels are needed, and the cost may
be as much as eight times higher to transmit the message.
• When distances are short, it may nonetheless be both
feasible and economic to use parallel channels in return for
high data rates.
• The baud rate refers to the signalling rate at which data is
sent through a channel and is measured in electrical
transitions per second.
• The data rate of a channel is often specified by its bit rate.
• However, an equivalent measure channel capacity is
• In general, the maximum data rate a channel can support
is directly proportional to the channel’s bandwidth and
inversely proportional to the channel’s noise level.
• Parallel communication implies more than one
wire in addition to a ground connection.
• An 8-bit parallel channel transmits eight bits (or
a byte) simultaneously.
• A serial channel would transmit those bits one at
a time.
• If both operated at the same clock speed, the
parallel channel would be eight times faster.
• A parallel channel will generally have additional
control signals such as a clock, to indicate that
the data is valid, and possibly other signals for
handshaking and directional control of data
• A communications protocol defines the order and
meaning of bits in a serial transmission.
• It specifies a procedure for exchanging messages.
• A protocol will define how many data bits compose a
message unit, the framing and formatting bits, any
error-detecting bits that may be added and other
information that governs control of the
• Channel efficiency is determined by the protocol
design rather than by digital hardware considerations.
• Reliability: protocols provide greater immunity to
noise by adding error-detecting and error-correcting codes.
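One of the simplest error-detecting codes a protocol can add is a single parity bit. The sketch below uses even parity on a 7-bit data unit; the function names and the frame layout (data bits followed by the parity bit) are assumptions for illustration, not a specific protocol.

```python
def add_even_parity(data_bits: list[int]) -> list[int]:
    """Append a parity bit so that the frame holds an even number of 1s."""
    parity = sum(data_bits) % 2
    return data_bits + [parity]


def check_even_parity(frame: list[int]) -> bool:
    """Return True if the received frame still has an even number of 1s."""
    return sum(frame) % 2 == 0


frame = add_even_parity([1, 0, 1, 1, 0, 0, 1])
print(frame)                     # [1, 0, 1, 1, 0, 0, 1, 0]
print(check_even_parity(frame))  # True

# Simulate a noise spike that flips one bit in transit.
corrupted = frame.copy()
corrupted[2] ^= 1
print(check_even_parity(corrupted))  # False
```

A single parity bit detects any odd number of flipped bits but cannot locate or correct them; error-correcting codes add more redundancy to do that.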
• Baud rate: the baud rate of a data communications
system is the number of symbols transferred per second.
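Baud rate and bit rate coincide only when each symbol carries one bit. The relationship bit rate = baud rate * Log2(number of distinct symbols) can be sketched as follows; the function name `bit_rate_from_baud` and the sample values are illustrative.

```python
import math


def bit_rate_from_baud(baud: float, symbol_levels: int) -> float:
    """Bits per second carried by `baud` symbols per second."""
    return baud * math.log2(symbol_levels)


print(bit_rate_from_baud(1000, 2))   # 1000.0
print(bit_rate_from_baud(1000, 16))  # 4000.0
```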
• Modulation techniques are methods used to encode
digital information in an analog world.
• The 3 basic modulation techniques are:
• AM (amplitude modulation)
• FM (frequency modulation)
• PM (phase modulation)
• All 3 modulation techniques employ a carrier signal.
• A carrier signal is a single frequency that is used to
carry the data.
• For digital, the data is either a 1 or a 0. When we
modulate the carrier, we are changing its
characteristics to correspond to either a 1 or a 0.
AM - Amplitude Modulation
• Amplitude Modulation modifies the amplitude of the
carrier to represent 1s or 0s.
• In this example, a 1 is represented by the presence
of the carrier for a predefined period of 3 carrier cycles.
• Absence or no carrier indicates a 0.
• Simple to design.
• Noise spikes on transmission medium interfere with the
• Loss of connection is read as 0s.
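The amplitude scheme described here (carrier present for a 1, absent for a 0, 3 carrier cycles per bit) can be simulated with sampled sine values. The sampling resolution, the energy threshold, and the function names are assumptions made for this sketch.

```python
import math

CYCLES_PER_BIT = 3     # from the example: 3 carrier cycles per bit
SAMPLES_PER_CYCLE = 8  # assumed sampling resolution
SAMPLES_PER_BIT = CYCLES_PER_BIT * SAMPLES_PER_CYCLE


def am_modulate(bits: list[int]) -> list[float]:
    """Emit the carrier for a 1 and silence for a 0."""
    samples = []
    for bit in bits:
        for k in range(SAMPLES_PER_BIT):
            carrier = math.sin(2 * math.pi * k / SAMPLES_PER_CYCLE)
            samples.append(carrier if bit else 0.0)
    return samples


def am_demodulate(samples: list[float]) -> list[int]:
    """Decide each bit from the signal energy in its time slot."""
    bits = []
    for i in range(0, len(samples), SAMPLES_PER_BIT):
        energy = sum(s * s for s in samples[i:i + SAMPLES_PER_BIT])
        # A full-amplitude carrier has energy SAMPLES_PER_BIT / 2; use half of that.
        bits.append(1 if energy > SAMPLES_PER_BIT / 4 else 0)
    return bits


am_signal = am_modulate([1, 0, 1, 1, 0])
print(am_demodulate(am_signal))  # [1, 0, 1, 1, 0]
```

An all-zero input, such as a lost connection, decodes as a run of 0s, which matches the disadvantage listed above.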
Frequency Modulation (FM)
• Frequency Modulation modifies the frequency of
the carrier to represent the 1s or 0s.
• In this example, a 0 is represented by the
original carrier frequency and a 1 by a much
higher frequency: the cycles are spaced closer
together.
• Immunity to noise on transmission medium.
• Always a signal present. Loss of signal easily
detected.
• Requires 2 frequencies
• Detection circuit needs to recognize both
frequencies when signal is lost.
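The two-frequency scheme can be sketched in the same style. The specific frequencies (3 carrier cycles per bit for a 0, 6 for a 1), the sample count, and the correlation-based detector are assumptions made for illustration.

```python
import math

FM_SAMPLES_PER_BIT = 48  # assumed sampling resolution
CYCLES_FOR_0 = 3         # original carrier frequency (assumed value)
CYCLES_FOR_1 = 6         # higher frequency for a 1 (assumed value)


def tone(cycles: int) -> list[float]:
    """One bit period of a sine wave with the given number of cycles."""
    return [math.sin(2 * math.pi * cycles * k / FM_SAMPLES_PER_BIT)
            for k in range(FM_SAMPLES_PER_BIT)]


def fm_modulate(bits: list[int]) -> list[float]:
    """Send the higher frequency for a 1 and the original carrier for a 0."""
    samples = []
    for bit in bits:
        samples.extend(tone(CYCLES_FOR_1 if bit else CYCLES_FOR_0))
    return samples


def fm_demodulate(samples: list[float]) -> list[int]:
    """Compare each slot against both reference tones and pick the better match."""
    ref0, ref1 = tone(CYCLES_FOR_0), tone(CYCLES_FOR_1)
    bits = []
    for i in range(0, len(samples), FM_SAMPLES_PER_BIT):
        chunk = samples[i:i + FM_SAMPLES_PER_BIT]
        match0 = abs(sum(s * r for s, r in zip(chunk, ref0)))
        match1 = abs(sum(s * r for s, r in zip(chunk, ref1)))
        bits.append(1 if match1 > match0 else 0)
    return bits


fm_signal = fm_modulate([0, 1, 1, 0, 1])
print(fm_demodulate(fm_signal))  # [0, 1, 1, 0, 1]
```

The detector has to recognize both frequencies, which reflects the extra circuitry mentioned above.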
Phase Modulation (PM)
Phase Modulation modifies the phase of the carrier to
represent a 1 or 0.
• The carrier phase is switched at every occurrence of a 1
bit but remains unaffected for a 0 bit.
• The phase of the signal is measured relative to the
phase of the preceding bit.
• The bits are timed to coincide with a specific number
of carrier cycles (3 in this example = 1 bit).
• Only 1 frequency used
• Easy to detect loss of carrier
• Complex circuitry required to generate and detect
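The phase scheme described here, where the phase is switched at every 1 bit and each bit is read relative to the preceding one, can be sketched as a differential encoder and decoder. The sampling resolution, the initial reference phase of 0, and the function names are assumptions for this sketch.

```python
import math

PM_CYCLES_PER_BIT = 3     # from the example: 3 carrier cycles = 1 bit
PM_SAMPLES_PER_CYCLE = 8  # assumed sampling resolution
PM_SAMPLES_PER_BIT = PM_CYCLES_PER_BIT * PM_SAMPLES_PER_CYCLE


def pm_modulate(bits: list[int]) -> list[float]:
    """Flip the carrier phase by 180 degrees on every 1; keep it for a 0."""
    samples = []
    phase = 0.0  # initial reference phase (assumed)
    for bit in bits:
        if bit:
            phase = (phase + math.pi) % (2 * math.pi)
        for k in range(PM_SAMPLES_PER_BIT):
            samples.append(math.sin(2 * math.pi * k / PM_SAMPLES_PER_CYCLE + phase))
    return samples


def pm_demodulate(samples: list[float]) -> list[int]:
    """Detect a 1 whenever the phase differs from that of the preceding bit."""
    reference = [math.sin(2 * math.pi * k / PM_SAMPLES_PER_CYCLE)
                 for k in range(PM_SAMPLES_PER_BIT)]
    bits = []
    previous_sign = 1  # matches the assumed initial phase of 0
    for i in range(0, len(samples), PM_SAMPLES_PER_BIT):
        chunk = samples[i:i + PM_SAMPLES_PER_BIT]
        correlation = sum(s * r for s, r in zip(chunk, reference))
        sign = 1 if correlation >= 0 else -1
        bits.append(1 if sign != previous_sign else 0)
        previous_sign = sign
    return bits


pm_signal = pm_modulate([1, 0, 1, 1, 0])
print(pm_demodulate(pm_signal))  # [1, 0, 1, 1, 0]
```

Only one carrier frequency is used; all information is in the phase changes between consecutive bit periods.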
• What is the need for storage of information?
• What considerations are there while choosing a storage device for information?
• What are short & long term storage devices?
• Studying the file organization is an important aspect of computer science. Explain
• A file should be organized in such a way that the records are always available for
processing with no delay. What are the schemes for doing so?
• What are the techniques for searching records in a file?
• Write short notes on:
• Sequential (SAM)
• Line Sequential (LSAM)
• What is data communication?
• Define signals. Distinguish between periodic and aperiodic signals.
• What are the factors on which data rate depends?
• Describe the Nyquist bit rate formula for a noiseless channel.
• Describe the Shannon capacity formula for a noisy channel.
• Distinguish between serial and parallel data communication.
• What are communication protocols and their relevance?
• What are signal modulation techniques? Describe the following:
• AM (amplitude modulation)
• FM (frequency modulation)
• PM (phase modulation)