Classical Linear Codes

<b>Codes</b>

Let $F$ be a finite field. A code $C$ with code length $n$ is just a subset of the vector space $F^n$ over $F$. When $C$ is a linear subspace of dimension $k$, it is called an $ [n,k]$ (linear) code, and $n-k$ is the redundancy. Otherwise, it is nonlinear. When $F=\mathbb Z_2$, $C$ is referred to as a binary code.

<b>Hamming distance.</b> For $u=(x_1,\dots,x_n),v=(y_1,\dots,y_n)\in F^n$, define $d(u,v)$ to be the size of the set $\{1\leq i\leq n\mid x_i\not=y_i\}$, i.e., the number of coordinates in which $u$ and $v$ differ.

The minimal distance of a code $C$ is defined to be $d_C=\min\{d(u,v)\mid u,v\in C, u\not=v\}.$

<b>Hamming weight.</b> For $u\in F^n$, the Hamming weight of $u$ is defined to be $\mathrm{wt}(u)=d(u,0)$.
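The two definitions above can be sketched directly; a minimal illustration (with vectors represented as tuples of integers, here over $\mathbb Z_2$):

```python
def hamming_distance(u, v):
    """Number of coordinates in which u and v differ."""
    assert len(u) == len(v)
    return sum(1 for x, y in zip(u, v) if x != y)

def hamming_weight(u):
    """Hamming weight: the distance from u to the zero vector."""
    return hamming_distance(u, (0,) * len(u))
```

For example, `hamming_distance((1,0,1,1), (1,1,0,1))` is 2, and `hamming_weight((1,0,1,1))` is 3.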

<b>Lemma.</b> $d(u,v)=d(u+w,v+w)$ for all $u,v,w\in F^n$.

<b>Lemma.</b> If $C$ is linear, then the minimal distance $d_C$ of $C$ is just the minimal Hamming weight of nonzero vectors of $C$.

If $C$ is a linear $ [n,k]$-code with minimal distance $d_C$, it is said to be an $ [n,k,d_C]$ code.

<b>Theorem.</b> If $d_C$ is the minimal distance of a code $C$, then $C$ can correct up to $\lfloor\frac{d_C-1}{2}\rfloor$ transmission errors.
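The lemma and the theorem can be checked by brute force on a small example. The following sketch (illustrative, not part of the notes) spans a binary $[4,2]$ code from a hypothetical generator matrix with rows $1100$ and $0011$, then compares the minimal distance with the minimal nonzero weight and computes the correction radius:

```python
from itertools import combinations, product

def span_f2(gen_rows):
    """All F_2-linear combinations of the generator rows."""
    n = len(gen_rows[0])
    code = set()
    for coeffs in product([0, 1], repeat=len(gen_rows)):
        word = tuple(sum(c * row[i] for c, row in zip(coeffs, gen_rows)) % 2
                     for i in range(n))
        code.add(word)
    return code

def d(u, v):
    """Hamming distance."""
    return sum(x != y for x, y in zip(u, v))

# Example: the [4,2] binary code spanned by 1100 and 0011.
C = span_f2([(1, 1, 0, 0), (0, 0, 1, 1)])
d_min = min(d(u, v) for u, v in combinations(C, 2))
wt_min = min(sum(u) for u in C if any(u))
# Lemma: d_min == wt_min (both equal 2 here).
radius = (d_min - 1) // 2  # Theorem: correctable errors (0 for this code)
```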


<b>Channels</b>

A usual description of the noisy channel that a code goes through is the DMC: discrete memoryless channel. Here discrete means that there are only finitely many symbols used to compose codewords (codevectors), and memoryless means that errors in different symbols occur independently: an error in one symbol does not affect the reliability of the neighboring symbols.

<i>Example.</i> An $m$-ary symmetric channel: There are $m$ input and output symbols, namely $x_1$, $\dots$, $x_m$. The probability $p(x_j|x_i)$ that a transmitted symbol $x_i$ gets changed to $x_j$, $i\not=j$, is a fixed constant, $p$ say, independent of $i$ and $j$ (hence the channel is called symmetric). Then $q=1-(m-1)p$ is the probability $p(x_i|x_i)$ that $x_i$ remains unchanged after going through the channel.

One often uses the binary symmetric channel as a model: There are 2 symbols, 0 and 1, for inputs and outputs. The channel changes 0 to 1 and 1 to 0 with probability $p$, and leaves 0 and 1 unchanged with probability $1-p$.
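A minimal sketch of a binary symmetric channel, assuming bits are represented as 0/1 integers: each bit is flipped independently with crossover probability $p$, which is exactly the memoryless property.

```python
import random

def bsc(bits, p, rng=None):
    """Pass a bit sequence through a binary symmetric channel.

    Each bit is flipped independently with probability p.
    """
    rng = rng or random.Random()
    return [b ^ (rng.random() < p) for b in bits]
```

Sanity checks: with $p=0$ the sequence passes through unchanged, and with $p=1$ every bit is flipped.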


<b> Repetition Code</b>

The easiest way of making a linear code is to repeat each symbol $n$ times. When a vector is received, simply use a majority count to determine the original codevector. So a binary repetition code is an $ [n,1,n]$ code whose code space is spanned by $(1,1,\dots,1)$.
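The encoding and the majority-count decoding above can be sketched as follows (a minimal illustration, taking $n$ odd so the vote is never tied):

```python
def rep_encode(bit, n):
    """Repeat a single bit n times."""
    return [bit] * n

def rep_decode(received):
    """Majority count: decode to the more frequent symbol."""
    n = len(received)
    return 1 if sum(received) > n // 2 else 0
```

With $n=3$ this is a $[3,1,3]$ code, so by the theorem above it corrects $\lfloor(3-1)/2\rfloor = 1$ error: e.g. `rep_decode([1, 0, 1])` recovers 1.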

<b> $ [7,4]$ Hamming Code</b>

The $ [7,4]$ Hamming code is a $ [7,4,3]$ linear code capable of correcting one transmission error, defined as follows. Let $V=\mathbb Z_2^7$ and let $C$ be the subspace of $V$ defined by the following equations: $(x_1,\dots,x_7)\in C$ if and only if $x_4+x_5+x_6+x_7=0$, $x_2+x_3+x_6+x_7=0$, and $x_1+x_3+x_5+x_7=0$. When a vector $(y_1,\dots,y_7)$ is received, one computes the numbers $\alpha=y_1+y_3+y_5+y_7$, $\beta=y_2+y_3+y_6+y_7$, and $\gamma=y_4+y_5+y_6+y_7$. If all three are zero, no error occurred during the transmission; otherwise, assuming at most one error, the number $\alpha+2\beta+4\gamma$ (read in the decimal system) gives the position where the error occurred.
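The decoding procedure just described can be sketched directly; a minimal illustration, with a received vector given as a list of 7 bits standing for $y_1,\dots,y_7$:

```python
def hamming_correct(y):
    """Correct at most one error in a received [7,4] Hamming vector.

    y is a list of 7 bits y[0..6] standing for y_1..y_7.
    """
    y = list(y)
    alpha = (y[0] + y[2] + y[4] + y[6]) % 2
    beta  = (y[1] + y[2] + y[5] + y[6]) % 2
    gamma = (y[3] + y[4] + y[5] + y[6]) % 2
    pos = alpha + 2 * beta + 4 * gamma  # 1-based error position; 0 = no error
    if pos:
        y[pos - 1] ^= 1  # flip the erroneous bit
    return y
```

For instance, the all-ones vector satisfies all three defining equations, so it is a codeword; flipping its third bit gives $(1,1,0,1,1,1,1)$, for which $\alpha=1$, $\beta=1$, $\gamma=0$, so $\alpha+2\beta+4\gamma=3$ and the decoder restores position 3.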