Alphabets, Strings, Languages - York University · CSE2001, Fall 2006


CSE2001, Fall 2006 1

Alphabets, Strings, Languages

• Definition: An alphabet is a finite set of objects called symbols.

Notation: Σ = {a, b, . . . , z}

• Definition: A string over an alphabet Σ is a finite sequence of symbols from Σ.

Notation: w, x, y, . . . for strings. Instead of w = (w_1, w_2, . . . , w_k) we will simply write w = w_1 w_2 . . . w_k.

Note: strings over binary alphabet {0, 1} are often called binary strings.

• Definition: The length of a string is the number of symbols it contains.

Notation: |w|

• The empty string ε.

• Relations between strings:

– Equality: w = y

Definition: Let w = w_1 . . . w_k and y = y_1 . . . y_m be two non-empty strings. We say that w equals y (in symbols, w = y) if k = m and, for all i with 1 ≤ i ≤ m, we have w_i = y_i. The empty string ε equals only itself.

– Lexicographic order: w < y (can be defined when there is an ordering of the symbols in the alphabet).


Definition: Let Σ be an alphabet with a defined order < on its symbols, and let w = w_1 . . . w_k and y = y_1 . . . y_m be two non-empty strings over Σ. Then w is lexicographically smaller than y (in symbols, w < y) if

1. k < m, or

2. k = m, and for some i, 1 ≤ i ≤ m, we have w_i < y_i, while for all j, 1 ≤ j < i, we have w_j = y_j.

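The empty string, ε, is lexicographically smaller than any non-empty string.

Note: under this definition a shorter string always comes before a longer one; for example, over {a, b} with a < b we have b < aa.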
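The order defined above can be sketched in a few lines of Python. This is only an illustration: the function name lex_less is hypothetical, and the order on symbols is assumed to be Python's built-in character order.

```python
def lex_less(w: str, y: str) -> bool:
    """Return True if w < y under the order defined above.

    Assumption: symbols are compared with Python's built-in character order.
    """
    # Case 1: a shorter string is always smaller (this also covers ε).
    if len(w) != len(y):
        return len(w) < len(y)
    # Case 2: equal lengths, so the first position where they differ decides.
    for a, b in zip(w, y):
        if a != b:
            return a < b
    return False  # w equals y, so w is not smaller than y


# Sorting by (length, string) gives the same order for Python strings.
print(sorted(["b", "ab", "", "aa", "a"], key=lambda s: (len(s), s)))
# prints ['', 'a', 'b', 'aa', 'ab']
```

Note that "b" is placed before "aa" here, although a dictionary would list them the other way round.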

• Operations on strings:

– Concatenation: wy, w^k

Definition: Let w = w_1 . . . w_k and y = y_1 . . . y_m be two strings over some alphabet Σ. Then the concatenation of w and y (in symbols w · y, or just wy) is the string w_1 . . . w_k y_1 . . . y_m.

Note: concatenation is associative, i.e. x(yz) = (xy)z, so we will simply write xyz.

Notation: by w^k we denote w concatenated with itself k times, i.e. w^k = ww . . . w (k times).
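As a small illustration, Python strings can stand in for strings over Σ, with + as concatenation. The helper name power is hypothetical, and treating the 0-th power as ε is an assumption (the usual convention) that the notes do not state for strings.

```python
def power(w: str, k: int) -> str:
    """Return w concatenated with itself k times (k >= 0)."""
    result = ""  # the empty Python string plays the role of ε
    for _ in range(k):
        result = result + w
    return result


x, y, z = "ab", "c", "de"
# Associativity on one example: x(yz) = (xy)z
assert x + (y + z) == (x + y) + z

print(power("ab", 3))  # prints ababab
```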

– Substring

Definition: Let w be a string over some alphabet Σ. A string y is called a substring of w if there are two strings x and z such that w = xyz. If x = ε, then y is called a prefix of w. If z = ε, then y is called a suffix of w.
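The definition can be checked directly by searching for the strings x and z. The sketch below does exactly that; all function names are hypothetical and Python strings are assumed to model strings over Σ.

```python
def decompositions(w: str, y: str):
    """Yield every pair (x, z) such that w == x + y + z."""
    for i in range(len(w) - len(y) + 1):
        if w[i:i + len(y)] == y:
            yield w[:i], w[i + len(y):]


def is_substring(w: str, y: str) -> bool:
    return any(True for _ in decompositions(w, y))


def is_prefix(w: str, y: str) -> bool:
    # y is a prefix of w when some decomposition has x = ε
    return any(x == "" for x, _ in decompositions(w, y))


def is_suffix(w: str, y: str) -> bool:
    # y is a suffix of w when some decomposition has z = ε
    return any(z == "" for _, z in decompositions(w, y))


print(list(decompositions("banana", "ana")))
# prints [('b', 'na'), ('ban', '')]
```

The second pair has z = ε, so "ana" is a suffix of "banana". With these definitions, ε and w itself are both prefixes and suffixes of w.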

– Reverse: w^R

Definition: Let w be a string over some alphabet Σ. Then the reverse of w (in symbols w^R) is defined as follows: if w = ε, then w^R = ε; otherwise, if w = w_1 . . . w_k, then w^R = w_k . . . w_1.

Exercise: Prove that for every two strings x and y, (xy)^R = y^R x^R.


Definition: A string w over some alphabet Σ is called a palindrome if w = w^R.
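The reverse and the palindrome test can be sketched as follows. The function names are hypothetical. The loop at the end only spot-checks the identity from the exercise on all strings of length at most 3 over {a, b}; it is a sanity check, not a proof.

```python
from itertools import product


def reverse(w: str) -> str:
    """Return the reverse of w: its symbols in the opposite order."""
    symbols = [w[i] for i in range(len(w) - 1, -1, -1)]  # w_k, ..., w_1
    return "".join(symbols)  # the empty list gives ε


def is_palindrome(w: str) -> bool:
    return w == reverse(w)


# All strings over {a, b} of length 0, 1, 2 and 3.
words = ["".join(t) for n in range(4) for t in product("ab", repeat=n)]

# Spot-check: the reverse of xy is the reverse of y followed by the reverse of x.
assert all(reverse(x + y) == reverse(y) + reverse(x)
           for x in words for y in words)

print([w for w in words if len(w) == 3 and is_palindrome(w)])
# prints ['aaa', 'aba', 'bab', 'bbb']
```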

• Definition: A language over an alphabet Σ is a set of strings over Σ.

Notation: L, M, N, . . . for languages. |L| for the size (number of strings) of L.

Notation: Σ∗ will denote the set of all strings over Σ. Then, a language L over Σ is just a subset of Σ∗.

Notation: Σ^k will denote the set of all strings of length k over Σ.
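For a small alphabet, the set of all strings of a fixed length k can be enumerated explicitly. The sketch below assumes every symbol is a single character; the function name strings_of_length is hypothetical.

```python
from itertools import product


def strings_of_length(sigma: set, k: int) -> set:
    """Return the set of all strings of length k over the alphabet sigma."""
    return {"".join(t) for t in product(sorted(sigma), repeat=k)}


binary = {"0", "1"}
print(sorted(strings_of_length(binary, 2)))  # prints ['00', '01', '10', '11']

# There are |Σ|^k strings of length k, and exactly one string of length 0 (ε).
assert len(strings_of_length(binary, 3)) == 2 ** 3
assert strings_of_length(binary, 0) == {""}
```

Σ∗ itself is infinite for any non-empty Σ, so it can only be enumerated up to some length bound.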

• Operations on languages:

– Union: L ∪ M = {w|w ∈ L or w ∈ M}.

– Intersection: L ∩ M = {w|w ∈ L and w ∈ M}.

– Subtraction: L − M = {w|w ∈ L and w ∉ M}.

– Complementation: L̄ = Σ∗ − L = {w|w ∈ Σ∗ and w ∉ L}.

– Concatenation: L · M = {wx|w ∈ L and x ∈ M}.

Note: if L and M are over different alphabets, say Σ_1 and Σ_2, then the resulting language can be taken to be over Σ_1 ∪ Σ_2. This will not happen in this course.

Notation: we will write LM instead of L · M. Also, concatenation is associative (i.e. L(MN) = (LM)N), so we drop the brackets and write LMN.

Note: for any language L, ∅L = L∅ = ∅.

And also {ε}L = L{ε} = L.
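Finite languages can be modelled as Python sets of strings, which gives a quick way to experiment with these operations. The sketch below is illustrative only; the function name concat is hypothetical, and complementation is left out because Σ∗ is infinite.

```python
def concat(L: set, M: set) -> set:
    """Return LM: every string of L followed by every string of M."""
    return {w + x for w in L for x in M}


L = {"a", "ab"}
M = {"b", "bb"}

print(sorted(L | {"b"}))     # union with {b}: prints ['a', 'ab', 'b']
print(sorted(L - {"a"}))     # subtraction of {a}: prints ['ab']
print(sorted(concat(L, M)))  # prints ['ab', 'abb', 'abbb']

# The two notes above, checked on this example:
assert concat(set(), L) == set() and concat(L, set()) == set()
assert concat({""}, L) == L and concat(L, {""}) == L
```

LM has only three strings here, not four, because "a" + "bb" and "ab" + "b" are the same string. In general |LM| can be smaller than |L| times |M|.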


– L^k = LL . . . L (k times) for k > 0; L^0 = {ε}.

Exercise: Prove that for any language L, and any m ≥ 0, n ≥ 0, L^m L^n = L^(m+n).
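Powers of a finite language can be computed by repeated concatenation, starting from {ε}. The sketch below uses hypothetical function names and checks the identity from the exercise for a few small values of m and n; this is a sanity check on one language, not a proof.

```python
def concat(L: set, M: set) -> set:
    """Return LM for finite languages given as sets of strings."""
    return {w + x for w in L for x in M}


def lang_power(L: set, k: int) -> set:
    """Return the k-th power of L, where the 0-th power is {ε}."""
    result = {""}  # the 0-th power
    for _ in range(k):
        result = concat(result, L)
    return result


L = {"a", "bb"}
print(sorted(lang_power(L, 2)))  # prints ['aa', 'abb', 'bba', 'bbbb']

# Spot-check: the m-th power times the n-th power is the (m+n)-th power.
for m in range(4):
    for n in range(4):
        assert concat(lang_power(L, m), lang_power(L, n)) == lang_power(L, m + n)
```

Note that the 0-th power of the empty language is {ε}, not ∅, while every higher power of ∅ is ∅.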

– Kleene closure (or star closure): L∗ = L^0 ∪ L^1 ∪ L^2 ∪ . . . .

– Reverse: L^R = {w^R|w ∈ L}.
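The Kleene closure is infinite whenever L contains a non-empty string, so a program can only list it up to a length bound. The sketch below does that for a finite L and also computes the reverse of a language. The function names and the max_len parameter are hypothetical.

```python
def star_up_to(L: set, max_len: int) -> set:
    """Return the strings of the Kleene closure of L with length <= max_len."""
    result = {""}    # the 0-th power of L contributes ε
    frontier = {""}  # strings found in the previous round
    while frontier:
        # Extend each new string by one more string of L, keep only the
        # ones that are short enough and have not been seen before.
        frontier = {w + x for w in frontier for x in L
                    if len(w + x) <= max_len} - result
        result |= frontier
    return result


def reverse_language(L: set) -> set:
    """Return the set of reverses of the strings in L."""
    return {w[::-1] for w in L}


L = {"ab", "c"}
print(sorted(star_up_to(L, 3), key=lambda s: (len(s), s)))
# prints ['', 'c', 'ab', 'cc', 'abc', 'cab', 'ccc']
print(sorted(reverse_language(L)))  # prints ['ba', 'c']
```

The loop stops because only finitely many strings within the bound can be built from a finite L, and each round keeps only strings not seen before.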