Sunday 30 March 2014

Sequence of random variables of smaller variance in average

Define the problem as follows.

Suppose we have a sequence of random variables $X_1, X_2,...$ that takes value $\left\{ 0,1\right\}$ with $\lim _{n\rightarrow \infty} \frac{\sum_{i=1}^n X_i}{n} = 0.5$. i.e. In long term this is a fair coin toss. Design an mechanism so that for some $n$ and $k$, the distribution of $\sum_{i=n}^{n+k-1} X_i$ has a smaller variance.

It is very clear that if the variables are independent then the distribution is always fixed. Therefore we focus on dependent variables, in particular we consider creating a Markov Chain, that $X_{n+k}$ depends on $\frac{\sum_{i=n}^{n+k-1} X_i}{k}$. We call this dependence function be $f: [0,1]\rightarrow [0,1]$, which is rotational symmetric around $(0.5,0.5)$.

In order to investigate the effectiveness of our mechanism we have to go deep into the calculations because the mean calculations won't work (we need variance!)

Consider $k=2$ with $f(x) = 0.9 - 0.8x$ and the four states are $00,01,10,11$. Then the transition matrix is given by:

$P = \begin{bmatrix} 0.1&0&0.5&0\\0.9&0&0.5&0\\0&0.5&0&0.9\\0&0.5&0&0.1\end{bmatrix}$

Where the transition matrix assuming independence equal to

$P = \begin{bmatrix} 0.5&0&0.5&0\\0.5&0&0.5&0\\0&0.5&0&0.5\\0&0.5&0&0.5\end{bmatrix}$

Since $P$ represents a regular Markov chain it's clear that it has an eigenvalue 1 with the steady state vector $\frac{5}{28}(1,1.8,1.8,1)^T$. Comparing with $(0.25,0.25,0.25,0.25)^T$ this one has a high tendency at the central.

What about higher k? Consider the case $k=3$ with 8 states $000,001,010,011,100,101,110,111$, where

$P =\begin{bmatrix} \frac{1}{10}&0&0&0&\frac{11}{30}&0&0&0\\ \frac{9}{30}&0&0&0&\frac{19}{30}&0&0&0\\ 0&\frac{11}{30}&0&0&0&\frac{19}{30}&0&0\\ 0&\frac{19}{30}&0&0&0&\frac{11}{30}&0&0\\ 0&0&\frac{11}{30}&0&0&0&\frac{19}{30}&0\\ 0&0&\frac{19}{30}&0&0&0&\frac{11}{30}&0\\ 0&0&0&\frac{19}{30}&0&0&0&\frac{9}{10}\\ 0&0&0&\frac{11}{30}&0&0&0&\frac{1}{10}\\\end{bmatrix}$

With steady state vector, according to mathematica, $c(0.161905, 0.397403, 0.397403, 0.397403, 0.397403, 0.397403, 0.397403, 0.161905)^T$ (where $c$ is the constant to make the sum of the probabilities 1.) Surely there is a higher tendency to stay at the centre, but it looks less centre-located than $k=2$. Of course it's very hard to determine the variance through the steady state vector from the Markov chain as every states represents a series of random variables, but we can plot the distribution for it as follows. Here we sample 10000 times for $\sum_{i=1}^{100} X_i$.



Here we see the black line as the standard binomial distribution, red as $k=2$, green as $k = 8$, blue as $k=25$ and mathematically the line tends to the standard binomial distribution as $n \rightarrow \infty$. If we take a function that is closer to $f(x) = 0.5$ (take the distance in inner product space, if you like) it will, of course, closer to the original distribution.


For the above graph, red refers to $y = 0.9-0.8x$, green is $y = 0.8-0.6x$ while blue is $0.7-0.4x$, and the difference is obvious.

Convexity also affect the ability of 'forcing' the RV back to the mean using the same idea (distance between functions). The variance is naturally smaller if $f(x)$ is larger approaching to $0.5^-$ and smaller from $0.5^+$. If you do some Fourier analysis we can 'force' the random variable sequence into a perfect alternating 0 1 0 1 0 1 0 1!!!


The red line refers to $f(x) = 0.5 + 0.4((1-x)^5-x^5))$, the green line being $f(x) = 0.9-0.8x$ and the blue line is $f(x) = 0.5+0.4\cos (\pi x)$.

Let's demonstrate the Fourier series correlation here. For the function $f: [0,1]\rightarrow [0,1]$ where $f(x) = 1$ for $x\in [0,0.5]$ and $f(x) = 0$ for $x\in (0.5,1]$ we have $f(x) = \frac{1}{2} + \sum_{k=1}^{\infty}\frac{2\sin ((4n-2)\pi x)}{(2n-1)\pi}$. Turncating the values that exceeds 1 and below 0 we have the following distribution by taking $1,2,3$ terms with $k=15$:



We put a large graph here because it takes great effort for us to separate the three lines. Zooming gives



We see (once more) that Fourier series converges very quick despite the occurrence of Gibbs' phenomenon here. There's an interesting question that why does the distribution converges to something non-trivial instead of going to the impulse $\delta (\frac{n}{2})$, where the sequence is perfectly 1 0 1 0 ......? I'll leave this question to the reader as the solution is quite simple.

I don't intend to 'teach' something here but creating a sequence of dependent random variables is a very practical topic and the variance could play an important role. For instance in some round-based RPG battle you certainly don't want your game to be dominated by randomness, so it makes sense to restrict the randomness so that it's about the same for both sides over most if not all short periods. The faster converging speed is what we want to tweek the variance.

Foods for thought.

1) Why does the distribution using Fourier series as $f(x)$ does not converge to an impulse, where the sequence of distribution has no variance? You may want to change the following variables to see what happens (a) $n$, the length of sequence simulated (b) $k$ the length of correlation (c) the number of terms in the Fourier series or the turncation (d) numerical precision.

2) It is possible to calculate the variance directly from the Markov chain steady state probability?

3) By inverting  the idea we can't create sequence of RVs of larger variance directly, for instance, $f(x) = 0.5-0.4\cos (\pi x)$ doesn't give a sequence of larger variance. Think about if it is possible, then see if there are practical applications for it.

4) For the Markov Chain case $k=3$, why does the steady state probability for all 6 states except $000,111$ are equal? (i.e., That looks 'unusual'.)

R code, for $f(x)$:

f  = function(n,k,g) {
  s = NULL
  for(i in 1:k) {
    s = c(s,rbinom(1,1,0.5))
  }
  for(i in (k+1):n) {
    prev_mean = mean(s[(i-k):(i-1)])
    s = c(s,rbinom(1,1,g(prev_mean)))
  }
  return(sum(s))
}

Saturday 1 March 2014

Triangle centres and choice of coordination

Recently I have a few tutorial sessions with others on NSS maths and olympiad maths and there are a few questions on geometry which is not very easy to be solved by analytical method. With or without extra restrictions we may be able to solve them in coordinate system, but sometimes we don't know where should we start our calculation. What should we do to make these calculations much simplified?

Before doing this we can review a few operations in coordinate geometry in senior secondary school level.

1) Straight lines, in general form $y = mx+b$, or point slope form $(y-y_0) = m(x-x_0)$.

For parallel translation consider $y= mx + b + \epsilon $ or $(y- y_0- \epsilon _y ) = m(x-x_0 - \epsilon)$.

For perpendicular lines consider lines with slope $-m^{-1}$.

2) Centres of triangle. They are discussed in the previous entry.

3) Locus. Common locus includes straight line(s), circle and conic sections. The standard form can be generalized by $a(x-x_0)^2 + b(y-y_0)^2 = r^2$.

Mathematics is consistent (at this level) in the sense that using different methods will give you the same answer. However there are many approaches, in particular there are many method to define a geometry problem in the coordinate plane.

To define a problem in the coordination system first we define the coordinate system to be used. For 2D geometry the three choices are Cartesian, polar and Argand coordinate. Cartesian is the obvious choice as polar produces ALOT of trouble when you operate points besides the origin (we will see this later), and argand deals more with dynamic stuffs like transitions, but we don't want things to move. In the static case Argand is similar to Cartesian but the later can be much easier.

The second step is to put the points into the coordinate. You can rotate and transform freely as long as this does not violate the original given information. A very common transformation is to put two points of the triangle at (0,0) and (0,1), then the last point can be placed at $(x,y) \in {\mathbb{R}^+}^2$. (This could lead to a paradox, but this is beyond our scope.)

The third step is to deduce the transformed relation in the new coordinate, which is technically the most complicated part. You will find angle bisector giving you so much pain. Consider the following example.

Example 1. Given that $\tan (\theta ) = k$. Show that $\tan (\frac{\theta}{2}) = \frac{k}{\sqrt{1+k^2}+1}$.

This can be performed on a right-angled triangle, then the general case immediately follows. (why?)

Consider three points $A(0,0)$, $B(1,0)$ and $C(0,k)$ (since this is a right-angled triangle) and $D$ be the point of intersection of $AC$ and the angle bisector of $B$.

By angle bisection theorem, $\frac{DA}{DC} = \frac{BA}{BC}$ so that $DA = \frac{k}{\sqrt{1+k^2}+1}$ and the result follows.

Example 2. Verify that $r = \frac{a+b-c}{2}$ where $a,b,c,r$ is the sidelengths and inradius respectively if (and in fact, only if), $ABC$ is a right-angled triangle.

We had a proof using Pyth. them to get the result directly, in a much easier way, but this time we try to use a purely coordinate-geom approach.

Here we take the same set up with example 1. Then the angle bisector of $A$ is given by $y=x$ (if the triangle is not right-angled the formulation of this one will be complicated too). The slope $m_{CB} = -k$ and the angle bisector of $B$ has slope $m_{B'} = \frac{-k}{\sqrt{1+k^2}+1}$. Then we have two equations to find the incenter (two is enough, assuming the fact that angle bisectors are concurent): $y = x$ and $y = m_{B'}(x-1)$, which gives us $y = x = r = \frac{k}{1+k+\sqrt{1+k^2}}$. $r=y$ because the foot from the incenter to $AB$ is vertical.

Notice that all right angled triangle can be transformed into the above form, then $r =  \frac{k}{1+k+\sqrt{1+k^2}} = \frac{1+k-\sqrt{1+k^2}}{2}$ can be shown by rationalization.

When you have an isosceles triangle you may want it to be symmetrical in your coordinate system.

Example 3. (PCIMC 2014 Heat SS Q17) In triangle ABC, $AB=AC$. $D$ is on $AB$ so that $AD:DB = 2:5$. Extend $BC$ to $E$ so that $DE = EC$. If the area of triangle DEB is 10, find the area of triangle ABC.

Note that in this question transformation does not change the given information. Notice that there is a ratio $2:5$ so we may want to put $A$ at $(0,7)$ and do the rest of the calculations. However this question can be solved easily with rotation only (rotation do not alter ANY information, unless they give you a pre-assigned coordinate with slopes, etc.). Put $A$ at $(0,a)$ with $C(0,x)$, $B(0,-x)$. Then $D$ is at $(-\frac{2}{7}x, \frac{5}{7}a)$ (why?) and by symmetry $E$ is at $(\frac{-11}{7}x,0)$.

Now consider the area of triangle DEB we have $\frac{1}{2} \frac{4x}{7} \frac{5a}{7} = 10$, which gives us the area of triangle ABC $\frac{(2x)(a)}{2} = 49$.

Readers may want to try the case when $a$ is fixed. Now we perform one classic example: the collinearity of centres of triangle. It may look hard but this is not that hard as one imagined as: (1) it does not involve the stupid incenter, with all those complicated trigonometric functions (2) we use a nice coordination to make calculation easier.

By arbitrary transformation consider the triangle $A(0,0)$, $B(1,0)$ and $C(a,b)$ at the first quadrant.

The centroid is easy: $G(\frac{a+1}{3}, \frac{b}{3})$.

For the orthocenter, notice that $m_{BC} = \frac{y}{x-1}$ so that the altitude from $A$ to $BC$ has slope $\frac{1-x}{y}$. Moreover the altitude from $C$ to $AB$ is vertical so that we have two equations: $x=a$ and $y = \frac{1-a}{b}x = \frac{a(1-a)}{b}$ so that the orthocenter is at $H(a,\frac{a(1-a)}{b})$.

For the circumcenter, note that the perpendicular bisector of $AB$ is $x=\frac{1}{2}$. Now the perpendicular bisector of $BC$ has equation $y = \frac{b}{2} + \frac{1-a}{b}(x-\frac{a+1}{2})$. Solving the equation we have the circumcenter $O(\frac{1}{2}, \frac{b^2 - a(1-a)}{2b})$.

The coordinate of the circumcenter may not look good, but we can verify that the slope of the three points are equal to $\frac{-3a^2+3a-b^2}{(2a-1)b}$ so that they are collinear. Are there any better coordination? Please don't hesitate to tell me.

Food for thoughts:
1. (PCIMC 2014 Heat SS Q12) In triangle ABC, D is the mid-point of BC. E and F are on AB, AC respectively so that BC//EF. If $S_{AEF}:S_{DEF} = 7:3$ (area ratio) and $S_{ABC} = 100$, find $S_{DFC}$.

2. (PCIMC 2013 Final SS Q14) For a kite ABCD, $AB = AD = 50$, $CB = CD = 75$. E,F,G,H are on AB, BC, CD, DA respectively so that EFGH is a square. Moreover let angle $ABC, ADC$ be right angle and EF//AC. Find the area of the square.