1. Introduction
Syed M. Hashir Raza Zaidi : 023R13-07
Davut Karasu: 023R13-30
Umut Papatya: 023R13-31
Sana-ur-Rehman: 023R13-39
BSCS-(6A)
Submitted To:
Mam Safia Sultana
2. Introduction:
A context-free grammar (CFG) is a term used in formal
language theory to describe a certain type of formal
grammar. A context-free grammar is a set of production
rules that describe all possible strings in a given formal
language. Production rules are simple replacements. For
example, the rule
A → α
3. replaces A with α. There can be multiple replacement rules for any
given symbol. For example,
A → α, A → β
means that A can be replaced with either α or β.
Context-free grammars arise in linguistics, where they are used to
describe the structure of sentences and words in natural language.
They were in fact invented by the linguist Noam Chomsky for this
purpose, but have not really lived up to their original expectations. By
contrast, in computer science, as the use of recursively defined
concepts increased, they were used more and more. In an early
application, grammars were used to describe the structure
of programming languages. In a newer application, they are used in
an essential part of the Extensible Markup Language (XML) called
the Document Type Definition.
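4. In formal language theory, a context-free grammar (CFG) is said to be
in Chomsky normal form (first described by Noam Chomsky) if all of its
production rules have the form A → BC, A → a, or S → ε, where A, B, and C
are nonterminals, a is a terminal, S is the start symbol, and neither B
nor C is the start symbol.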
A context-free grammar (CFG) is a set of recursive rewriting rules (or
productions) used to generate patterns of strings.
A CFG consists of the following components:
a set of terminal symbols, which are the characters of the
alphabet that appear in the strings generated by the grammar.
a set of nonterminal symbols, which are placeholders for patterns of
terminal symbols that can be generated by the nonterminal symbols.
5. a set of productions, which are rules for replacing (or rewriting)
nonterminal symbols (on the left side of the production) in a string with
other nonterminal or terminal symbols (on the right side of the
production).
a start symbol, which is a special nonterminal symbol that appears in the
initial string generated by the grammar.
To generate a string of terminal symbols from a CFG, we:
Begin with a string consisting of the start symbol;
Apply one of the productions with the start symbol on the left-hand side,
replacing the start symbol with the right-hand side of the production;
Repeat the process of selecting nonterminal symbols in the string, and
replacing them with the right hand side of some corresponding
production, until all nonterminals have been replaced by terminal
symbols.
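The generation procedure above can be sketched in Python. This is only an illustration under stated assumptions: the dictionary encoding of the rules, the function name `derive`, the `max_steps` cutoff, and the choice of the example grammar S → 0S1 | ε are all made up for the sketch and are not part of the slides.

```python
import random

# Hypothetical encoding: each nonterminal maps to a list of right-hand sides,
# and each right-hand side is a list of symbols ([] stands for ε).
RULES = {"S": [["0", "S", "1"], []]}


def derive(rules, start, rng, max_steps=1000):
    """Randomly rewrite nonterminals until only terminals remain.

    Returns the generated string and the list of intermediate strings.
    """
    form = [start]
    steps = ["".join(form)]
    for _ in range(max_steps):
        # Positions of symbols that still have productions (nonterminals).
        positions = [i for i, sym in enumerate(form) if sym in rules]
        if not positions:
            return "".join(form), steps
        i = rng.choice(positions)
        # Replace the chosen nonterminal by a randomly chosen right-hand side.
        form[i:i + 1] = rng.choice(rules[form[i]])
        steps.append("".join(form) or "ε")
    raise RuntimeError("derivation did not finish within max_steps")


word, steps = derive(RULES, "S", random.Random(0))
print(" => ".join(steps))
```

Because the productions are chosen at random, each run with a different seed may print a different derivation; every finished derivation for this grammar ends in a string of the form 0^n 1^n.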
6. Context-Free Grammars:
A grammar is a set of rules for putting strings together and so
corresponds to a language.
A grammar consists of:
a set of variables (also called nonterminals), one of which is
designated the start variable; it is customary to use upper-case
letters for variables;
a set of terminals (from the alphabet); and
a list of productions (also called rules).
Example: 0^n 1^n. Here is a grammar: S → 0S1, S → ε. S is the only
variable. The terminals are 0 and 1.
There are two productions.
Using a Grammar: A production allows one to take a string
containing a variable and replace the variable by the right-hand
side (RHS) of the production. A string w of terminals is generated
by the grammar if, starting with the start variable, one can apply
productions and end up with w.
7. The sequence of strings so obtained is a derivation of w. We focus on
a special version of grammars called a context-free grammar (CFG). A
language is context-free if it is generated by
a CFG.
Example: S → 0S1, S → ε
The string 0011 is in the language generated. The derivation is:
S ⇒ 0S1 ⇒ 00S11 ⇒ 0011
For compactness, we write S → 0S1 | ε, where the vertical bar means or.
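The derivation of 0011 can be reproduced mechanically. The short Python sketch below builds the derivation of 0^n 1^n by applying S → 0S1 exactly n times and then S → ε; the function name `derivation` is a hypothetical helper chosen for this illustration.

```python
def derivation(n):
    """Sentential forms in the derivation of 0^n 1^n from S -> 0S1 | ε."""
    forms = ["S"]
    for _ in range(n):
        # Apply S -> 0S1 to the single S in the current form.
        forms.append(forms[-1].replace("S", "0S1"))
    # Finish with S -> ε.
    forms.append(forms[-1].replace("S", ""))
    return forms


print(" ⇒ ".join(derivation(2)))  # S ⇒ 0S1 ⇒ 00S11 ⇒ 0011
```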
Example: Palindromes
Let P be the language of palindromes with alphabet { a, b }.
One can determine a CFG for P by finding a recursive decomposition.
If we peel the first and last symbols from a palindrome, what remains is a
palindrome; and if we wrap a palindrome with the same symbol front
and back, then it is still a palindrome.
A first CFG is P → a P a | b P b | ε. Actually, this generates only those of
even length. Adding the rules P → a | b covers the palindromes of odd
length as well.
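The remark about even length can be checked by brute force for short strings. The Python sketch below compares the grammar as written, P → aPa | bPb | ε, with a variant that also has the rules P → a | b (the usual way to cover odd lengths). The helper names `generated` and `all_palindromes` and the length bound 5 are assumptions of this sketch.

```python
from itertools import product


def generated(bases, max_len, alphabet="ab"):
    """Strings of length <= max_len derivable from P, built bottom-up.

    `bases` are the right-hand sides without P; every other rule is
    assumed to have the form P -> c P c for a symbol c of the alphabet.
    """
    frontier = {b for b in bases if len(b) <= max_len}
    result = set(frontier)
    while frontier:
        # Wrap every string found in the last round with the same symbol.
        frontier = {c + w + c for w in frontier for c in alphabet
                    if len(w) + 2 <= max_len}
        result |= frontier
    return result


def all_palindromes(max_len, alphabet="ab"):
    """Brute-force reference: every palindrome of length <= max_len."""
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return {w for w in words if w == w[::-1]}


even_only = generated([""], 5)               # P -> aPa | bPb | ε
with_centers = generated(["", "a", "b"], 5)  # plus P -> a | b

print(sorted({len(w) for w in even_only}))   # [0, 2, 4]
print(with_centers == all_palindromes(5))    # True
```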
8. Formal Definition
One can provide a formal definition of a context-free grammar. It is a 4-tuple (V, Σ, S, P) where:
V is a finite set of variables;
Σ is a finite alphabet of terminals, disjoint from V;
S ∈ V is the start variable; and
P is the finite set of productions. Each production has the form A → w, where A ∈ V and w ∈ (V ∪ Σ)∗.
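One possible way to mirror the 4-tuple in code is a small Python dataclass that validates its fields. The class name `CFG`, the field names, and the representation of a production as a (head, body) pair are assumptions of this sketch, not notation from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # Σ
    start: str             # S
    productions: tuple     # P, as (head, body) pairs; body is a tuple of symbols

    def __post_init__(self):
        if self.variables & self.terminals:
            raise ValueError("variables and terminals must be disjoint")
        if self.start not in self.variables:
            raise ValueError("start symbol must be a variable")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            # Context-free: the left side is exactly one variable.
            if head not in self.variables:
                raise ValueError(f"head {head!r} is not a variable")
            if any(sym not in symbols for sym in body):
                raise ValueError(f"unknown symbol in body of {head!r}")


# The grammar S -> 0S1 | ε as a 4-tuple; the empty tuple stands for ε.
g = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"0", "1"}),
    start="S",
    productions=(("S", ("0", "S", "1")), ("S", ())),
)
print(len(g.productions))  # 2
```

The check on `head` is where the "context-free" restriction shows up: a production may rewrite one variable only, regardless of the symbols around it.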
Further Examples:
Even 0’s: A CFG for all binary strings with an even number of 0’s. Find the decomposition. If the first
symbol is 1, then an even number of 0’s remains. If the first symbol is 0, then go to the next 0; after that,
again an even number of 0’s remains.
This yields: S → 1S | 0A0S | ε, A → 1A | ε.
Alternate CFG for Even 0’s: Here is another CFG for the same language. Note that when the first symbol
is 0, what remains has an odd number of 0’s.
S → 1S | 0T | ε, T → 1T | 0S
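As a sanity check, one can enumerate the short strings generated by the two grammars for Even 0’s and compare them. The Python sketch below does this for strings of length at most 6. It is a bounded test, not a proof. The helper `language_up_to`, the encoding of rules as strings (upper-case letters are variables, "" stands for ε), and the bound 6 are assumptions of this sketch; the search also assumes that forms cannot grow in variables without growing in terminals, which holds for these two grammars.

```python
from collections import deque
from itertools import product


def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`."""
    seen = {start}
    queue = deque([start])
    result = set()
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable (upper-case letter), if any.
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            result.add(form)
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for ch in new if not ch.isupper())
            # Prune forms that already contain too many terminals.
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return result


G1 = {"S": ["1S", "0A0S", ""], "A": ["1A", ""]}
G2 = {"S": ["1S", "0T", ""], "T": ["1T", "0S"]}

expected = {"".join(p) for n in range(7) for p in product("01", repeat=n)
            if p.count("0") % 2 == 0}

print(language_up_to(G1, "S", 6) == language_up_to(G2, "S", 6) == expected)
```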
Example
A CFG for the regular language corresponding to the regular expression (RE) 00∗11∗. The
language is the concatenation of two languages: all nonempty strings of zeroes with all nonempty
strings of ones. S → CD, C → 0C | 0, D → 1D | 1
Example
Complement: A CFG for the complement of RE 00∗11∗. CFGs don’t do “and”s, but they do do
“or”s. A string is not of the form 0^i 1^j with i, j > 0 exactly when it is one of the following:
contains 10; is only zeroes (this includes the empty string); or is only ones. This yields the CFG:
S → A | B | C, A → D10D, D → 0D | 1D | ε, B → 0B | ε, C → 1C | 1
Consistency and Completeness: Note that to check that a grammar and a description match, one
must check two things: that everything the grammar generates fits the description (consistency),
and that everything in the description is generated by the grammar (completeness).
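Both checks can be approximated by brute force on short strings. The Python sketch below tests the grammar S → CD, C → 0C | 0, D → 1D | 1 against the description 00∗11∗ and reports violations of consistency and of completeness separately. It only covers strings up to a chosen length, so it can find counterexamples but cannot prove correctness. The names `language_up_to` and `check`, the string encoding of rules, and the bound 6 are assumptions of this sketch.

```python
import re
from collections import deque
from itertools import product


def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`.

    Upper-case letters are variables. Assumes forms cannot grow in
    variables without also growing in terminals.
    """
    seen, queue, result = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            result.add(form)
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            if sum(not ch.isupper() for ch in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return result


def check(rules, start, describes, alphabet, max_len):
    """Return (consistency violations, completeness violations)."""
    generated = language_up_to(rules, start, max_len)
    described = {"".join(p) for n in range(max_len + 1)
                 for p in product(alphabet, repeat=n) if describes("".join(p))}
    # Generated but not described breaks consistency;
    # described but not generated breaks completeness.
    return generated - described, described - generated


RULES = {"S": ["CD"], "C": ["0C", "0"], "D": ["1D", "1"]}
not_described, not_generated = check(
    RULES, "S", lambda w: re.fullmatch(r"0+1+", w) is not None, "01", 6)
print(not_described, not_generated)  # set() set()
```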
9. Example
Consider the CFG S → 0S1S | 1S0S | ε. The string 011100 is generated:
S ⇒ 0S1S ⇒ 01S ⇒ 011S0S ⇒ 0111S0S0S ⇒ 01110S0S ⇒ 011100S ⇒ 011100
What does this language contain?
Certainly every string generated has equally many 0’s and 1’s.
But can any string with equally many 0’s and 1’s be generated?
Example Argument for Completeness
Yes. All strings with equally many 0’s and 1’s are generated. Reading a
nonempty such string from left to right, at some point equality between
0’s and 1’s is reached. The key is that if the string starts with 0, then
equality is first reached at a 1. So the portion between the first 0 and
this 1 itself has equally many 0’s and 1’s, as does the portion after this 1.
That is, one can break up the string as 0 w 1 x with both w and x in the
language, which matches the production S → 0S1S; a string that starts
with 1 is handled symmetrically by S → 1S0S.
The break-up of 00101101 is 0 · 0101 · 1 · 01, with w = 0101 and x = 01.
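The break-up used in this argument can be computed directly. The Python sketch below finds the first position where the counts of 0’s and 1’s become equal, splits the string there, and recurses on the two parts, mirroring the productions S → 0S1S | 1S0S | ε. The function names `split` and `in_language` are hypothetical names for this illustration.

```python
from itertools import product


def split(s):
    """Break a nonempty string as (first, w, closer, x) at the first point
    where the counts of 0's and 1's become equal; None if that never happens."""
    balance = 0
    for i, ch in enumerate(s):
        balance += 1 if ch == s[0] else -1
        if balance == 0:
            return s[0], s[1:i], ch, s[i + 1:]
    return None


def in_language(s):
    """Membership for S -> 0S1S | 1S0S | ε, following the break-up."""
    if s == "":
        return True          # S -> ε
    parts = split(s)
    if parts is None:
        return False
    _, w, _, x = parts
    return in_language(w) and in_language(x)


print(split("00101101"))  # ('0', '0101', '1', '01')
```

On every binary string up to a small length, `in_language` agrees with simply comparing the number of 0’s and 1’s, which is what the completeness argument predicts.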
10. A Silly Language CFG: This CFG generates sentences as composed
of noun- and verb-phrases: S → NP VP, NP → the N, VP → V NP,
V → sings | eats, N → cat | song | canary. This
generates “the canary sings the song”, but also “the song eats the
cat”. This CFG generates all “legal” sentences, not just meaningful
ones.
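Since this grammar has no recursion, its whole language can be listed. The Python sketch below does so, assuming the rules read S → NP VP, NP → the N, VP → V NP, V → sings | eats, N → cat | song | canary; the dictionary `RULES` and the function `expand` are hypothetical names for this illustration.

```python
from itertools import product

# Each right-hand side is a list of symbols; words not in RULES are terminals.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "V": [["sings"], ["eats"]],
    "N": [["cat"], ["song"], ["canary"]],
}


def expand(symbol):
    """All word sequences derivable from `symbol` (finite: no recursion)."""
    if symbol not in RULES:
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for combo in product(*(expand(s) for s in rhs)):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))  # 18
print("the song eats the cat" in sentences)  # True
```

The count 18 comes from 3 nouns × 2 verbs × 3 nouns, which includes sentences that are grammatical but not meaningful.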
Practice: Give grammars for the following two languages:
1. All binary strings with both an even number of zeroes and an even
number of ones.
2. All strings of the form 0^a 1^b 0^c such that a + c = b.
Practice Solutions:
1) S → 0X | 1Y | ε, X → 0S | 1Z (odd zeroes, even ones), Y → 1S | 0Z
(odd ones, even zeroes), Z → 0Y | 1X (odd ones, odd zeroes)
2) S → TU, T → 0T1 | ε, U → 1U0 | ε
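In solution 1 every production has the form "terminal followed by one variable" or ε, so a derivation can be followed like a state machine: the current variable is the state and each symbol read selects the production. The Python sketch below uses that observation to test the solution on all short strings. The table `STEP`, the function `generated`, and the length bound 8 are assumptions of this sketch.

```python
from itertools import product

# STEP[variable][symbol] is the variable that follows in the chosen production,
# e.g. S -> 0X gives STEP["S"]["0"] == "X".
STEP = {
    "S": {"0": "X", "1": "Y"},
    "X": {"0": "S", "1": "Z"},
    "Y": {"1": "S", "0": "Z"},
    "Z": {"0": "Y", "1": "X"},
}


def generated(w):
    """True if the grammar of solution 1 derives the binary string w."""
    variable = "S"
    for ch in w:
        variable = STEP[variable][ch]
    # Only S has an ε-production, so the derivation can stop only at S.
    return variable == "S"


ok = all(
    generated("".join(p)) == (p.count("0") % 2 == 0 and p.count("1") % 2 == 0)
    for n in range(9) for p in product("01", repeat=n)
)
print(ok)  # True
```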
11. Summary:
A context-free grammar (CFG) consists of a set of productions that
you use to replace a variable by a string of variables and terminals.
The language of a grammar is the set of strings it generates. A
language is context-free if there is a CFG for it.