CAP-204
(FUNDAMENTALS OF DATA STRUCTURES)
TERM PAPER
ON
Role of Data Structures in Programming languages
SUBMITTED TO:
Tajinder Mam
SUBMITTED BY:
Sachin Raj
A34
G(A)
RD3901
ACKNOWLEDGEMENT
I would like to acknowledge and extend my heartfelt gratitude to the
following persons who have made the completion of this Term Paper
possible:
Our Chancellor, Mr. Ashok Mittal for his vital encouragement and
support.
Our Executive Dean, Mrs. Rashmi Mittal for her understanding and
assistance.
Tajinder Mam (Lecturer), for the help and inspiration extended.
Most especially, my family and friends, for assisting in the collection
of the topics for the term paper.
And God, who made all things possible.
Data Structures
A data structure is an arrangement of data in a computer's memory or on disk
storage. Common data structures include arrays, linked lists,
queues, stacks, binary trees, and hash tables. Algorithms, on the other hand, are
used to manipulate the data contained in these data structures, as in searching
and sorting.
Many algorithms apply directly to a specific data structure. When working with
a given data structure you need to know how to insert new data, search for a
specified item, and delete a specific item.
Commonly used algorithms are useful for:
Searching for a particular data item (or record).
Sorting the data. There are many ways to sort data, ranging from simple
sorting methods to advanced ones.
Iterating through all the items in a data structure (visiting each item in
turn so as to display it or perform some other action on it).
Characteristics of Data Structures
Data Structure   Advantages                                        Disadvantages
Array            Quick inserts; fast access if index known         Slow search; slow deletes; fixed size
Ordered Array    Faster search than unsorted array                 Slow inserts; slow deletes; fixed size
Stack            Last-in, first-out access                         Slow access to other items
Queue            First-in, first-out access                        Slow access to other items
Linked List      Quick inserts; quick deletes                      Slow search
Binary Tree      Quick search, inserts and deletes                 Deletion algorithm is complex
                 (if the tree remains balanced)
Red-Black Tree   Quick search, inserts and deletes                 Complex to implement
                 (tree always remains balanced)
2-3-4 Tree       Quick search, inserts and deletes                 Complex to implement
                 (tree always remains balanced; similar trees
                 are good for disk storage)
Hash Table       Very fast access if key is known; quick inserts   Slow deletes; access slow if key is not known;
                                                                   inefficient memory usage
Heap             Quick inserts; quick deletes; access to           Slow access to other items
                 largest item
Graph            Best models real-world situations                 Some algorithms are slow and very complex
Data Structures Used in Languages
Assembly languages and some low-level languages, such as BCPL, generally lack
support for data structures. Many high-level programming languages, on the
other hand, have special syntax or other built-in support for certain data
structures, such as vectors (one-dimensional arrays) in the C language,
multi-dimensional arrays in Pascal, linked lists in Common Lisp, and hash tables in Perl
and in Python. Many languages also provide basic facilities, such as references and
the definition of record data types, that programmers can use to build arbitrarily
complex structures.
Most programming languages feature some sort of library mechanism that allows
data structure implementations to be reused by different programs. Modern
languages usually come with standard libraries that implement the most common
data structures. Examples are the C++ Standard Template Library, the Java
Collections Framework, and Microsoft's .NET Framework.
Modern languages also generally support modular programming, the separation
between the interface of a library module and its implementation. Some provide
opaque data types that allow a module to hide implementation details from its clients.
Object-oriented programming languages, such as C++, Java, and the languages of
the .NET Framework, use classes for this purpose.
With the advent of multi-core processors, many known data structures have
concurrent versions that allow multiple computing threads to access the data
structure simultaneously.
Data Structures Used in C and C++
The following data structures are used in C++:
Linear Data Structure
ARRAY
LINKED LIST
STACK
QUEUES
RECURSION
Non-linear Data Structure
TREES
GRAPH
Array
Linear Arrays:
A linear array is a list of a finite number n of homogeneous data
elements (data elements of the same type) such that:
The elements of the array are referenced respectively by an
index set consisting of n consecutive numbers.
The elements of the array are stored respectively in successive
memory locations.
The number n of elements is called the length or size of the array.
Length = UB - LB + 1
UB = Upper Bound (largest index)
LB = Lower Bound (smallest index)
Example:
Arrays may be given a size such as A[5] or A[10]. With LB = 1 and UB = 10 the elements are
A[1], A[2], A[3], A[4], ..., A[10], so the length is 10 - 1 + 1 = 10.
Representation of linear Arrays
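Element  Address  Value
A[0]     200      1
A[1]     201      2
A[2]     202      3
A[3]     203      4
A[4]     204      5
A[5]     205      6
A[6]     206      7
A[7]     207      8
LOC(LA[K]) = Base(LA) + w(K - LB)
1) LOC(LA[K]) = address of the element LA[K] of the array LA
2) Base(LA) = base address of LA
3) w = number of words per memory cell for the array LA
4) LB = lower bound of the index set
In the table above, Base(A) = 200, w = 1 and LB = 0, so LOC(A[3]) = 200 + 1(3 - 0) = 203.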
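The length and address formulas can be checked with a small C++ sketch. The function names array_length and loc are illustrative assumptions, not part of the original notes, and the addresses are plain numbers rather than real machine pointers.

```cpp
#include <cstddef>

// Number of elements in an array indexed from lb to ub inclusive:
// Length = UB - LB + 1.
constexpr long array_length(long lb, long ub) { return ub - lb + 1; }

// Address of LA[k]: Base(LA) + w * (k - LB).
// `base` and `w` are hypothetical inputs chosen by the caller.
constexpr std::size_t loc(std::size_t base, std::size_t w, long k, long lb) {
    return base + w * static_cast<std::size_t>(k - lb);
}
```

With base 200, w = 1 and LB = 0, as in the table, loc(200, 1, 3, 0) evaluates to 203.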
Operation on arrays
Traversing
Insertion
Deletion
Searching
Sorting
Inserting in Linear Array
INSERT(LA, n, k, Item)
1. [Initialize counter] Set j = n
2. Repeat steps 3 and 4 while j >= k
3. [Move jth element downward] Set LA[j+1] = LA[j]
4. [Decrease counter] Set j = j - 1
[End of step 2 loop]
5. [Insert element] Set LA[k] = Item
6. [Reset n] Set n = n + 1
7. Exit
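A minimal C++ sketch of the insertion steps, assuming 0-based indexing (the procedure above counts from 1) and assuming the caller has reserved room for at least n+1 elements. The name insert_at is illustrative.

```cpp
#include <cassert>
#include <cstddef>

// Inserts `item` at 0-based position k of la[0..n-1] and increases n by one.
// Assumption: the underlying array can hold at least n + 1 elements.
void insert_at(int la[], std::size_t& n, std::size_t k, int item) {
    assert(k <= n);                      // k == n appends at the end
    for (std::size_t j = n; j > k; --j) {
        la[j] = la[j - 1];               // move each element one place down
    }
    la[k] = item;                        // insert the element
    ++n;                                 // reset n
}
```

Because every element after position k is moved, the cost grows with the number of elements that follow the insertion point.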
Deleting from linear Array
DELETE(LA, n, k, Item)
1. Set Item = LA[k]
2. Repeat for j = k to n-1
[Move (j+1)st element upward] Set LA[j] = LA[j+1]
[End of loop]
3. [Reset n] Set n = n - 1
4. Exit
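The deletion steps can be sketched in C++ in the same way, again assuming 0-based indexing. The name delete_at is illustrative.

```cpp
#include <cassert>
#include <cstddef>

// Removes the element at 0-based position k of la[0..n-1], closes the gap,
// decreases n by one, and returns the removed item.
int delete_at(int la[], std::size_t& n, std::size_t k) {
    assert(k < n);
    int item = la[k];
    for (std::size_t j = k; j + 1 < n; ++j) {
        la[j] = la[j + 1];               // move each later element one place up
    }
    --n;                                 // reset n
    return item;
}
```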
Sorting of Array in Data Structure
Sorting:
Let A be a list of n numbers. Sorting A refers to the operation of
rearranging the elements of A so they are in increasing order, i.e.
A[1]<A[2]<A[3]<...<A[n]
EXAMPLE-
8,4,19,2,7,13
After Sorting,
2,4,7,8,13,19
Types of Sorting techniques:
Bubble sort
Selection Sort
Insertion Sort
Merge Sort
Quick Sort
Heap Sort
Bubble Sort:
Suppose the list of numbers A[1], A[2], ..., A[N] is in memory. The
bubble sort algorithm works as follows:
Step 1: Compare A[1] and A[2] and arrange them in the desired order, so
that A[1]<A[2]. Then compare A[2] and A[3] and arrange them so
that A[2]<A[3]. Continue until A[N-1] is compared with A[N] and
arranged so that A[N-1]<A[N].
Note: Step 1 involves N-1 comparisons. When Step 1 is complete,
A[N] contains the largest element.
Step 2: Repeat Step 1 with one less comparison; that is, stop
after comparing and possibly rearranging A[N-2] and A[N-1]. (Step 2
involves N-2 comparisons.)
Step N-1: Compare A[1] and A[2] and arrange them so that A[1]<A[2].
The process of sequentially traversing all or part of a list is
frequently called a “pass”.
Example:
30,55,20,82,63,19,13,57
Pass 1:
1) Compare 30 and 55: no change.
2) Compare 55 and 20: interchange. 30,20,55,82,63,19,13,57
3) Compare 55 and 82: no change.
4) Compare 82 and 63: interchange. 30,20,55,63,82,19,13,57
5) Compare 82 and 19: interchange. 30,20,55,63,19,82,13,57
6) Compare 82 and 13: interchange. 30,20,55,63,19,13,82,57
7) Compare 82 and 57: interchange. 30,20,55,63,19,13,57,82
After Pass 1, 82 has reached its correct Nth position (N-1 comparisons).
30,20,55,63,19,13,57,82
Pass 2:
1) Compare 30 and 20: interchange. 20,30,55,63,19,13,57,82
2) Compare 30 and 55: no change.
3) Compare 55 and 63: no change.
4) Compare 63 and 19: interchange. 20,30,55,19,63,13,57,82
5) Compare 63 and 13: interchange. 20,30,55,19,13,63,57,82
6) Compare 63 and 57: interchange. 20,30,55,19,13,57,63,82
After Pass 2, the second largest number has reached position N-1 (N-2 comparisons).
The same procedure is repeated for N-1 passes, after which
the data will be sorted.
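The passes described above can be sketched in C++ as follows. This is one straightforward rendering under the assumption of a 0-based std::vector<int>; the name bubble_sort is illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort: pass p makes n - p comparisons of adjacent elements,
// so after pass p the p largest elements are in their final positions.
void bubble_sort(std::vector<int>& a) {
    const std::size_t n = a.size();
    for (std::size_t pass = 1; pass < n; ++pass) {
        for (std::size_t i = 0; i + pass < n; ++i) {
            if (a[i] > a[i + 1]) {
                std::swap(a[i], a[i + 1]);   // interchange the pair
            }
        }
    }
}
```

Applied to the example list 30,55,20,82,63,19,13,57, the first pass moves 82 to the last position, as in the trace above.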
Selection sort:
Suppose an array A with N elements A[1], A[2], ..., A[N] is in memory.
The selection sort algorithm first finds the smallest element in the list
and puts it in the first position. Then it finds the second smallest element
in the list and puts it in the second position, and so on.
Pass 1: Find the location LOC of the smallest in the list of N elements
A[1], A[2], ..., A[N], and then interchange A[LOC] and A[1]. Then A[1] is
sorted.
Pass 2: Find the location LOC of the smallest in the sublist of N-1
elements A[2], A[3], ..., A[N], and then interchange A[LOC] and A[2].
Then A[1], A[2] is sorted.
Pass N-1: Find the location LOC of the smaller of the elements A[N-1], A[N], and
then interchange A[LOC] and A[N-1]. Then A[1], A[2], ..., A[N] is sorted.
Example (final pass):
K=7, LOC=7: 11 22 33 44 55 66 77 88
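A short C++ sketch of selection sort, assuming a 0-based std::vector<int>. The variable loc plays the role of LOC in the description; the function name is illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: pass k finds the smallest element of a[k..n-1]
// and interchanges it with a[k].
void selection_sort(std::vector<int>& a) {
    const std::size_t n = a.size();
    for (std::size_t k = 0; k + 1 < n; ++k) {
        std::size_t loc = k;                 // location of the smallest so far
        for (std::size_t j = k + 1; j < n; ++j) {
            if (a[j] < a[loc]) loc = j;
        }
        std::swap(a[k], a[loc]);
    }
}
```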
Insertion Sort:
Suppose an array A with N elements A[1], A[2], ..., A[N] is in
memory. The insertion sort algorithm scans A from A[1] to A[N],
inserting each element A[K] into its proper position in the
previously sorted subarray A[1], A[2], ..., A[K-1], i.e.
Pass 1: A[1] by itself is trivially sorted.
Pass 2: A[2] is inserted either before or after A[1] so that A[1],
A[2] is sorted.
Pass 3: A[3] is inserted into its proper place in A[1], A[2], that is,
before A[1], between A[1] and A[2], or after A[2], so that
A[1], A[2], A[3] is sorted.
Pass N: A[N] is inserted into its proper place in
A[1], A[2], ..., A[N-1], so that A[1], A[2], ..., A[N] is sorted.
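The insertion sort passes can be sketched in C++ like this, assuming a 0-based std::vector<int>. The name insertion_sort is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort: a[0..k-1] is already sorted; a[k] is moved left
// past every larger element until it reaches its proper position.
void insertion_sort(std::vector<int>& a) {
    for (std::size_t k = 1; k < a.size(); ++k) {
        int temp = a[k];
        std::size_t j = k;
        while (j > 0 && a[j - 1] > temp) {
            a[j] = a[j - 1];                 // shift the larger element right
            --j;
        }
        a[j] = temp;                         // insert into the gap
    }
}
```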
Merge sort:
MERGING(A, R, B, S, C)
Let A and B be sorted arrays with R and S elements, respectively. This
algorithm merges A and B into an array C with R+S elements.
1. [Initialize] Set NA=1, NB=1 and PTR=1
2. [Compare] Repeat while NA<=R and NB<=S
If A[NA]<B[NB], then
a) [Assign element from A to C] Set C[PTR]=A[NA]
b) [Update pointers] Set PTR=PTR+1 and NA=NA+1
Else
a) [Assign element from B to C] Set C[PTR]=B[NB]
b) [Update pointers] Set PTR=PTR+1 and NB=NB+1
[End of if structure]
[End of loop]
3. [Assign remaining elements to C]
If NA>R then
Repeat for K=0,1,2,...,S-NB
Set C[PTR+K]=B[NB+K]
[End of loop]
Else
Repeat for K=0,1,2,...,R-NA
Set C[PTR+K]=A[NA+K]
[End of loop]
[End of if structure]
4. Exit
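A C++ sketch of the merging procedure, assuming both inputs are already sorted and using 0-based std::vector<int> in place of the arrays A, B and C. The name merge_sorted is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted vectors into one sorted vector.
std::vector<int> merge_sorted(const std::vector<int>& a,
                              const std::vector<int>& b) {
    std::vector<int> c;
    c.reserve(a.size() + b.size());
    std::size_t na = 0, nb = 0;
    while (na < a.size() && nb < b.size()) {
        if (a[na] < b[nb]) {
            c.push_back(a[na++]);            // assign element from A to C
        } else {
            c.push_back(b[nb++]);            // assign element from B to C
        }
    }
    while (na < a.size()) c.push_back(a[na++]);   // remaining elements of A
    while (nb < b.size()) c.push_back(b[nb++]);   // remaining elements of B
    return c;
}
```

Merge sort applies this step repeatedly, first to pairs of single elements, then to pairs of sorted pairs, and so on.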
Merge Sort Example
66,33,40,22,55,88,60,11,80,20,50,44,77,30
Each pass of the merge-sort algorithm starts at the beginning of the
array A and merges pairs of sorted subarrays as follows:
Pass 1: Merge each pair of elements to obtain the following list of
sorted pairs.
33,66 22,40 55,88 11,60 20,80 44,50 30,77
Pass 2: Merge each pair of pairs to obtain the following list of sorted
quadruplets.
22,33,40,66 11,55,60,88 20,44,50,80 30,77
Pass 3: Merge each pair of sorted quadruplets to obtain the following
two sorted subarrays.
11,22,33,40,55,60,66,88 20,30,44,50,77,80
Pass 4: Merge the two sorted subarrays.
11,20,22,30,33,40,44,50,55,60,66,77,80,88
LINKED LIST
List: A list is a linear collection of data items.
Linked list: A linked list, or one-way list, is a linear collection of
data elements called nodes, where the linear order is given by
means of pointers.
Each node in the list is divided into two parts: the first part
contains the information and the second part, called the link field,
contains the address of the next node in the list.
The linked list has a list pointer variable called START,
which contains the address of the first node in the list.
The null pointer signals the end of the list.
A special case is the list that has no nodes; such a list is called the
null list or empty list.
Representation of linked lists in memory
A linked list can be stored in two parallel arrays, INFO and LINK:
START=4, so INFO[4]=M
LINK[4]=2, so INFO[2]=C
LINK[2]=8, so INFO[8]=A
LINK[8]=NULL
Operations on a linked list:
Traversing
Insertion
Deletion
Searching
Sorting
Traversing a linked list
Algorithm:
Let LIST be a linked list in memory. This algorithm traverses LIST,
applying an operation PROCESS to each element of LIST. The variable
PTR points to the node currently being processed.
1) Set PTR=START [Initializes pointer PTR]
2) Repeat Steps 3 and 4 while PTR!=NULL
3) Apply PROCESS to INFO[PTR]
4) Set PTR=LINK[PTR] [PTR now points to the next node]
[End of Step 2 loop]
5) Exit
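The traversal can be sketched in C++ over the INFO and LINK arrays. As an assumption for this sketch, index 0 is reserved to stand for NULL (named NIL here) and nodes occupy indices 1 and above; the "process" applied to each node is simply appending its character to a string.

```cpp
#include <string>
#include <vector>

constexpr int NIL = 0;   // assumption: index 0 plays the role of NULL

// Visits every node starting at `start` and collects the INFO values in order.
std::string traverse(const std::vector<char>& info,
                     const std::vector<int>& link, int start) {
    std::string visited;
    for (int ptr = start; ptr != NIL; ptr = link[ptr]) {
        visited += info[ptr];            // apply the process to INFO[PTR]
    }
    return visited;
}
```

For a list with START=4, INFO[4]=M, LINK[4]=2, INFO[2]=C, LINK[2]=8, INFO[8]=A and LINK[8]=NULL, the traversal visits M, C, A in that order.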
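Searching in a linked list:
This algorithm finds the location LOC of the node where ITEM first
appears in the list, or sets LOC=NULL.
1) Set PTR=START
2) Repeat Step 3 while PTR!=NULL
3) If ITEM=INFO[PTR] then
Set LOC=PTR and Exit.
Else
Set PTR=LINK[PTR]
[End of Step 2 loop]
4) [Search unsuccessful] Set LOC=NULL
5) EXIT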
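A C++ sketch of the search, using the array representation with index 0 assumed to stand for NULL (named NIL). The function name search_list is illustrative.

```cpp
#include <vector>

constexpr int NIL = 0;   // assumption: index 0 plays the role of NULL

// Returns the location of the first node whose INFO equals `item`,
// or NIL if the item does not appear in the list.
int search_list(const std::vector<int>& info, const std::vector<int>& link,
                int start, int item) {
    for (int ptr = start; ptr != NIL; ptr = link[ptr]) {
        if (info[ptr] == item) return ptr;   // successful search
    }
    return NIL;                              // unsuccessful search
}
```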
Insertion into a linked list:
Three kinds of insertion can be done:
Inserting a node at the beginning of the list.
Inserting a node after the node with a given location.
Inserting a node into a sorted list.
Insertion at the beginning of a list:
INSFIRST(INFO, LINK, START, AVAIL, ITEM)
1) [Overflow?] If AVAIL=NULL, then write: Overflow, and Exit.
2) [Remove first node from AVAIL list]
Set NEW=AVAIL and AVAIL=LINK[AVAIL]
3) Set INFO[NEW]=ITEM [Copies new data into new node]
4) Set LINK[NEW]=START [New node now points to original first node]
5) Set START=NEW [Changes START so it points to the new node]
6) Exit.
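The INSFIRST steps can be sketched in C++ with a small pool of nodes. The struct name LinkedPool, its constructor, and the use of index 0 as NULL are assumptions made for this sketch; overflow is reported through the return value rather than a printed message.

```cpp
#include <vector>

constexpr int NIL = 0;   // assumption: index 0 plays the role of NULL

// Hypothetical pool of nodes 1..capacity, all initially on the AVAIL list.
struct LinkedPool {
    std::vector<int> info;
    std::vector<int> link;
    int start = NIL;
    int avail = NIL;

    explicit LinkedPool(int capacity) : info(capacity + 1), link(capacity + 1) {
        for (int i = 1; i < capacity; ++i) link[i] = i + 1;   // chain free nodes
        if (capacity > 0) { link[capacity] = NIL; avail = 1; }
    }
};

// Inserts `item` at the beginning of the list; returns false on overflow.
bool insert_first(LinkedPool& list, int item) {
    if (list.avail == NIL) return false;     // overflow
    int node = list.avail;                   // remove first node from AVAIL list
    list.avail = list.link[node];
    list.info[node] = item;                  // copy new data into new node
    list.link[node] = list.start;            // new node points to old first node
    list.start = node;                       // START points to the new node
    return true;
}
```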
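Inserting into a Sorted Linked List:
Before ITEM can be inserted, the node after which it belongs must be
found. The following procedure finds the location LOC of the last node
whose value does not exceed ITEM, or sets LOC=NULL if ITEM belongs at
the beginning of the list.
FINDA(INFO, LINK, START, ITEM, LOC)
1. [List empty?] If START=NULL, then Set LOC=NULL and Return.
2. [Special case] If ITEM<INFO[START], then Set LOC=NULL and Return.
3. Set SAVE=START and PTR=LINK[START]
4. Repeat Steps 5 and 6 while PTR!=NULL
5. If ITEM<INFO[PTR] then:
Set LOC=SAVE and Return.
[End of if]
6. Set SAVE=PTR and PTR=LINK[PTR]
[End of Step 4 loop]
7. Set LOC=SAVE
8. Return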
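A C++ sketch of FINDA, using the array representation with index 0 assumed to stand for NULL. The function name find_a is illustrative, and LOC is returned instead of being passed as an output parameter.

```cpp
#include <vector>

constexpr int NIL = 0;   // assumption: index 0 plays the role of NULL

// Returns the location of the node after which `item` should be inserted
// in a sorted list, or NIL if it belongs at the beginning (or the list is empty).
int find_a(const std::vector<int>& info, const std::vector<int>& link,
           int start, int item) {
    if (start == NIL || item < info[start]) return NIL;
    int save = start;
    for (int ptr = link[start]; ptr != NIL; ptr = link[ptr]) {
        if (item < info[ptr]) return save;   // item belongs just after `save`
        save = ptr;
    }
    return save;                             // item belongs at the end
}
```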
Deletion from a Linked List:
Deleting a node with a given item of information:
FINDB(INFO, LINK, START, ITEM, LOC, LOCP):
This procedure finds the location LOC of the first node N which
contains ITEM and the location LOCP of the node preceding N. If ITEM
does not appear in the list, the procedure sets LOC=NULL; if
ITEM appears in the first node, it sets LOCP=NULL.
1. [List empty?] If START=NULL then
Set LOC=NULL and LOCP=NULL and Return.
2. [Item in first node?] If INFO[START]=ITEM then
Set LOC=START and LOCP=NULL and Return.
3. Set SAVE=START and PTR=LINK[START]
4. Repeat Steps 5 and 6 while PTR!=NULL
5. If INFO[PTR]=ITEM then
Set LOC=PTR and LOCP=SAVE and Return.
6. Set SAVE=PTR and PTR=LINK[PTR]
[End of Step 4 loop]
7. [Search unsuccessful] Set LOC=NULL
8. Return
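A C++ sketch of FINDB together with one possible way of using it to delete a node. The deletion step itself (unlinking the node and returning it to the AVAIL list) is not spelled out above, so delete_item is an assumption about how the result would be used; the names Found, find_b and delete_item are illustrative, and index 0 stands for NULL.

```cpp
#include <vector>

constexpr int NIL = 0;   // assumption: index 0 plays the role of NULL

struct Found { int loc; int locp; };   // location of the node and of its predecessor

// Finds the first node containing `item` and the node preceding it.
Found find_b(const std::vector<int>& info, const std::vector<int>& link,
             int start, int item) {
    if (start == NIL) return {NIL, NIL};             // list empty
    if (info[start] == item) return {start, NIL};    // item in first node
    int save = start;
    for (int ptr = link[start]; ptr != NIL; ptr = link[ptr]) {
        if (info[ptr] == item) return {ptr, save};
        save = ptr;
    }
    return {NIL, NIL};                               // search unsuccessful
}

// Unlinks the first node containing `item` and returns it to the AVAIL list.
bool delete_item(const std::vector<int>& info, std::vector<int>& link,
                 int& start, int& avail, int item) {
    Found f = find_b(info, link, start, item);
    if (f.loc == NIL) return false;
    if (f.locp == NIL) start = link[start];          // delete the first node
    else link[f.locp] = link[f.loc];                 // bypass the node
    link[f.loc] = avail;                             // return node to AVAIL
    avail = f.loc;
    return true;
}
```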
Header Linked List:
A header linked list is a linked list which always contains a special node,
called the header node, at the beginning of the list. The following are the
two kinds of widely used header lists:
1. A grounded header list is a header list where the last node contains
the NULL pointer.
2. A circular header list is a header list where the last node points back
to the header node.
Circular header lists are frequently used instead of ordinary linked
lists because many operations are much easier to state and
implement using header lists. In a circular header list:
1) The NULL pointer is not used, and hence all pointers contain valid
addresses.
2) Every ordinary node has a predecessor, so the first node does not require a
special case.
Stack
A stack is a list of elements in which an element may be inserted or
deleted only at one end, called the top of the stack. Stacks are also
called LIFO (Last In, First Out) lists.
Basic operations of stack
“Push” is the term used to insert an element into a stack.
“Pop” is the term used to delete an element from a stack.
Array Implementation
The array size must be declared ahead of time.
Associated with each stack is an index TopOfStack;
for an empty stack, TopOfStack is set to -1 (array positions are numbered from 0).
Push
(1) Increment TopOfStack by 1.
(2) Set Stack[TopOfStack] = X
Pop
(1) Set return value to Stack[TopOfStack]
(2) Decrement TopOfStack by 1
These operations run in constant time.
The procedures below number the stack positions from 1 to MAXSTK, so TOP=0
denotes an empty stack.
PUSH(STACK, TOP, MAXSTK, ITEM)
1) [Stack already filled?]
If TOP=MAXSTK, then print: Overflow, and Return.
2) Set TOP=TOP+1 [Increases TOP by 1]
3) Set STACK[TOP]=ITEM
4) Return.
POP(STACK, TOP, ITEM)
1) [Stack empty?]
If TOP=0, then print: Underflow, and Return.
2) Set ITEM=STACK[TOP] [Assigns top element to ITEM]
3) Set TOP=TOP-1 [Decreases TOP by 1]
4) Return.
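A minimal C++17 sketch of an array-based stack with overflow and underflow checks. The class name ArrayStack is illustrative; as a design choice for this sketch, overflow is reported by returning false and underflow by returning an empty std::optional, instead of printing a message.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Fixed-capacity stack of ints; top_ counts the stored items (0 = empty).
template <std::size_t MaxStk>
class ArrayStack {
public:
    bool push(int item) {
        if (top_ == MaxStk) return false;    // overflow: stack already filled
        data_[top_++] = item;
        return true;
    }
    std::optional<int> pop() {
        if (top_ == 0) return std::nullopt;  // underflow: stack empty
        return data_[--top_];
    }
    bool empty() const { return top_ == 0; }

private:
    std::array<int, MaxStk> data_{};
    std::size_t top_ = 0;
};
```

Both operations touch only the top position, which is why they take constant time.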
Application of Stacks:
Postponed Decisions
Quick Sort.
Arithmetic Expressions(Polish Notation)
Polish Notation:
1) For most common arithmetic operations, the operator symbol is
placed between its two operands. Example: A+B, C-D
2) This is called infix notation. With this notation, we must
distinguish between (A+B)*C and A+(B*C) by using
parentheses or some operator-precedence convention.
3) Polish notation refers to the notation in which the operator
symbol is placed before its two operands. Example: +AB, -CD
Quick Sort:
Quick sort is an algorithm of the divide-and-conquer type; that is, the
problem of sorting a set is reduced to the problem of sorting two
smaller sets. Two stacks, LOWER and UPPER, hold the boundaries of the
sublists that are still to be sorted. Example:
44,33,11,55,77,90,40,60,99,22,88,66
1. Use the first element (44). Beginning with the last number, 66, scan the
list from right to left, comparing each number with 44 and stopping at
the first number less than 44. The number is 22. Interchange 44 with
22.
22,33,11,55,77,90,40,60,99,44,88,66
2. Beginning with 22, scan the list in the opposite direction, from
left to right, comparing each number with 44 and stopping at the first
number greater than 44. The number is 55. Interchange 55 with 44.
22,33,11,44,77,90,40,60,99,55,88,66
Continuing the alternating scans (44 is interchanged with 40 and then with 77)
places 44 in its final position, with the smaller numbers to its left and
the larger numbers to its right:
22,33,11,40,44,90,77,60,99,55,88,66
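The alternating scans and the stack of pending sublists can be sketched in C++17 as follows. This sketch uses a single std::stack of (lower, upper) index pairs in place of two separate stacks, which is a simplification; the names place_pivot and quick_sort are illustrative.

```cpp
#include <cstddef>
#include <stack>
#include <utility>
#include <vector>

// Moves a[lo] to its final position within a[lo..hi] using alternating
// right-to-left and left-to-right scans; returns that position.
std::size_t place_pivot(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    std::size_t left = lo, right = hi, loc = lo;
    while (true) {
        // Scan from the right for a number smaller than the pivot a[loc].
        while (a[loc] <= a[right] && loc != right) --right;
        if (loc == right) return loc;
        std::swap(a[loc], a[right]);
        loc = right;
        // Scan from the left for a number larger than the pivot a[loc].
        while (a[left] <= a[loc] && left != loc) ++left;
        if (loc == left) return loc;
        std::swap(a[left], a[loc]);
        loc = left;
    }
}

// Quick sort without recursion: pending sublists are kept on a stack.
void quick_sort(std::vector<int>& a) {
    if (a.size() < 2) return;
    std::stack<std::pair<std::size_t, std::size_t>> bounds;
    bounds.push({0, a.size() - 1});
    while (!bounds.empty()) {
        auto [lo, hi] = bounds.top();
        bounds.pop();
        std::size_t loc = place_pivot(a, lo, hi);
        if (loc > lo + 1) bounds.push({lo, loc - 1});   // left sublist, 2+ items
        if (loc + 1 < hi) bounds.push({loc + 1, hi});   // right sublist, 2+ items
    }
}
```

On the example list, the first call to place_pivot puts 44 at the fifth position, leaving two smaller sublists to be sorted.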
Recursion:
Suppose P is a procedure containing either a call statement to itself or
a call statement to a second procedure that may eventually result in a
call statement back to the original procedure P. Then P is called a
recursive procedure. So that the program does not continue to run indefinitely, a
recursive procedure must have the following two properties:
1) There must be certain criteria, called base criteria, for which the
procedure does not call itself.
2) Each time the procedure does call itself, it must be closer to the
base criteria.
A recursive procedure with these two properties is said to be well
defined.
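As an illustration of the two properties, here is a small C++ sketch of a well-defined recursive function. The factorial function is a standard example chosen for this sketch and does not appear in the notes above.

```cpp
// Recursive factorial: n! = n * (n-1)!, with 0! = 1.
unsigned long long factorial(unsigned n) {
    if (n == 0) return 1;                // base criteria: no further call
    return n * factorial(n - 1);         // each call moves closer to n == 0
}
```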
Queue
Like a stack, a queue is also a list. However, with a queue, insertion is
done at one end, while deletion is performed at the other end.
Accessing the elements of queues follows a First In, First Out (FIFO)
order.
Like customers standing in a check-out line in a store, the first
customer in is the first customer served.
Basic operations:
enqueue: insert an element at the rear of the list
dequeue: delete the element at the front of the list
Queue Insert
This procedure inserts an element ITEM into a queue stored in a circular
array QUEUE with N positions.
1) [Queue already filled?]
If FRONT=1 and REAR=N, or if FRONT=REAR+1, then write: Overflow, and Return.
2) [Find new value of REAR]
If FRONT=NULL then [Queue initially empty]
Set FRONT=1 and REAR=1
Else if REAR=N then
Set REAR=1
Else
Set REAR=REAR+1
[End of if]
3) Set QUEUE[REAR]=ITEM
4) Return.
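A C++17 sketch of a circular array queue. Instead of testing FRONT and REAR directly, this sketch keeps the front index and a count of stored items, which is an alternative bookkeeping choice with the same wrap-around behaviour; the class name CircularQueue is illustrative.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Fixed-capacity FIFO queue of ints stored in a circular array.
template <std::size_t N>
class CircularQueue {
    static_assert(N > 0, "queue capacity must be positive");

public:
    bool enqueue(int item) {
        if (count_ == N) return false;               // overflow
        data_[(front_ + count_) % N] = item;         // rear wraps around to 0
        ++count_;
        return true;
    }
    std::optional<int> dequeue() {
        if (count_ == 0) return std::nullopt;        // underflow
        int item = data_[front_];
        front_ = (front_ + 1) % N;                   // front wraps around to 0
        --count_;
        return item;
    }

private:
    std::array<int, N> data_{};
    std::size_t front_ = 0;
    std::size_t count_ = 0;
};
```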
There are several different algorithms to implement enqueue and
dequeue.
Array Implementation of a Queue
Naïve way:
When enqueuing, the front index is always fixed and the rear index
moves forward in the array.
Deque
A deque (double-ended queue) is a linear list in which elements can be added or
removed at either end but not in the middle (the ends are tracked by LEFT and RIGHT pointers).
There are two variations of a deque: the input-restricted deque
and the output-restricted deque.
An input-restricted deque is a deque which allows insertions at
only one end of the list but allows deletions at both ends of
the list.
An output-restricted deque is a deque which allows deletions at
only one end of the list but allows insertions at both ends of
the list.
Tree
Arrays, linked lists, stacks and queues are used to represent
linear and tabular data.
These structures are not suitable for representing hierarchical
data.
In hierarchical data we have
ancestors,
descendants,
superiors,
subordinates, etc
Introduction to Trees
Fundamental data storage structures used in programming
Combine advantages of ordered arrays and linked lists
Searching can be made as fast as in ordered arrays
Insertion and deletion as fast as in linked lists
Tree characteristics
Consists of nodes connected by edges
Nodes often represent entities such as people, car parts etc.
Edges between the nodes represent the way the nodes are
related.
The only way to get from node to node is to follow a path along
the edges.
Tree Terminology
Root: node at the top of the tree, without a parent (A)
Internal node: node with at least one child (A, B, C, F)
External node (leaf): node without children (E, I, J, K, G, H, D)
Ancestors of a node: parent, grandparent, great-grandparent,
etc.
Height of a tree: maximum depth of any node (4)
Descendants of a node: child, grandchild, great-grandchild, etc.
Degree of a node: number of children it has
Subtree: tree consisting of a node and its descendants
Path: sequence of nodes obtained by travelling from node to node along the
edges
Binary Trees
Every node in a binary tree can have at most two children.
The two children of each node are called the left child and right
child corresponding to their positions.
A node can have only a left child or only a right child or it can
have no children at all.
Application on binary Tree
arithmetic expressions
decision processes
searching
Complete Binary Trees
A complete binary tree is a binary tree in which every level is fully occupied except,
possibly, the bottom level, which is filled from left to right.
[Figures: a complete binary tree; a binary tree that is not complete]
Binary search tree(or ordered binary tree)
It is a node-based binary tree data structure which has the following
properties:
The left subtree of a node contains only nodes with keys less
than the node's key.
The right subtree of a node contains only nodes with keys
greater than the node's key.
Both the left and right subtrees must also be binary search trees.
Searching and Inserting in a BST
Suppose an ITEM of information is given. The following algorithm
finds the location of ITEM in the binary search tree T, or inserts ITEM
as a new node in its appropriate place in the tree.
a) Compare ITEM with the root node N of the tree.
(i) If ITEM<N, proceed to the left child of N.
(ii) If ITEM>N, proceed to the right child of N.
b) Repeat Step (a) until one of the following occurs:
(i) We meet a node N such that ITEM=N. In this case the search is
successful.
(ii) We meet an empty subtree, which indicates that the search is
unsuccessful, and we insert ITEM in place of the empty subtree.
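The search-or-insert procedure can be sketched in C++14 as follows. The names Node and find_or_insert, the integer keys, and the use of std::unique_ptr for ownership are assumptions made for this sketch.

```cpp
#include <memory>

struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Returns the node holding `item`. If the item is absent, a new leaf is
// created in place of the empty subtree where the search ended.
// `inserted` reports which of the two cases occurred.
Node* find_or_insert(std::unique_ptr<Node>& root, int item, bool& inserted) {
    std::unique_ptr<Node>* cur = &root;
    while (*cur) {
        if (item == (*cur)->key) {           // successful search
            inserted = false;
            return cur->get();
        }
        cur = (item < (*cur)->key) ? &(*cur)->left : &(*cur)->right;
    }
    *cur = std::make_unique<Node>(item);     // empty subtree: insert here
    inserted = true;
    return cur->get();
}
```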
Graph
Definition
A graph is a data structure consisting of:
- a set of vertices (or nodes).
- a set of edges (or links) connecting the vertices.
That is, G = (V, E), where V is a set of vertices, E is a set of edges, and each
edge is formed from a pair of distinct vertices in V.
If we represent our problem data using a graph data structure, we can
use standard graph algorithms (often available from code libraries) to
solve it.
Graph Algorithms
Commonly used graph algorithms include:
Searching for a path between two nodes. This can be used in game
playing, AI, route finding, etc.
Finding the shortest path between two nodes.
Finding a possible ordering of nodes given some constraints.
Example:
finding the order of modules to take, or the order of actions to complete a task.
A Graph ADT: Operations
We need to define:
Operations for modifying and inspecting the graph.
A data structure for the graph itself.
For a simple undirected, unlabelled graph, a small set of operations is
enough, to:
Create a graph.
Add and remove edges.
Check if an edge exists.
If we assume all nodes are identified by a number, the following C++
functions can be used:
graph();    // constructor
~graph();   // destructor (may be empty)
void addedge(int n1, int n2);
void removedge(int n1, int n2);
bool edgeexists(int n1, int n2);
A Simple Graph ADT using C++ Classes
Using the above operations we can have the following
very simple class definition:
const int MaxSize = 100;   // assumed capacity; no value is given in the notes
class graph
{
public:
    graph();
    ~graph();
    void addedge(int n1, int n2);
    void removedge(int n1, int n2);
    bool edgeexists(int n1, int n2);
private:
    bool g[MaxSize][MaxSize];   // adjacency matrix
};
This uses a fixed-size array, which is not ideal because we may want graphs of varying
size. Pointers can be used to allow variable-sized arrays. The class can also be improved
with a private variable holding the size of the graph and a constructor
argument to set that size.
Graph ADT: Implementation
Adjacency matrix method.
Use an N x N array of boolean values:
    0 1 2 3
0   F T T F
1   T F T F
2   T T F T
3   F F T F
(or just use integers, with 1/0)
If the array name is G, then G[n][m] = T if and only if an edge exists between node n
and node m.
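One way the adjacency matrix method might be implemented in C++ is sketched below. It follows the suggestion of passing the graph size to the constructor; the class name Graph, the snake_case method names, and the use of std::vector in place of a raw array are assumptions for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Undirected, unlabelled graph stored as an N x N boolean adjacency matrix.
class Graph {
public:
    explicit Graph(std::size_t n) : g_(n, std::vector<bool>(n, false)) {}

    void add_edge(std::size_t n1, std::size_t n2) { set(n1, n2, true); }
    void remove_edge(std::size_t n1, std::size_t n2) { set(n1, n2, false); }
    bool edge_exists(std::size_t n1, std::size_t n2) const {
        return g_.at(n1).at(n2);         // at() throws if a node is out of range
    }
    std::size_t size() const { return g_.size(); }

private:
    // Keeps the matrix symmetric, since edges have no direction.
    void set(std::size_t n1, std::size_t n2, bool value) {
        g_.at(n1).at(n2) = value;
        g_.at(n2).at(n1) = value;
    }
    std::vector<std::vector<bool>> g_;
};
```

The matrix shown above corresponds to a four-node graph with the edges 0-1, 0-2, 1-2 and 2-3.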