Socks

Gengulphus · #334476

Gengulphus wrote:More seriously, a somewhat more challenging problem: for which numbers of socks, some black and some white, is it possible for the probability that two socks drawn at random are both white to be exactly 1/2? And for such collections of socks, how large can the probability that two socks drawn at random are both black be made?

It's about time I gave the answer to this. UncleEbenezer has basically got it right, but only as a computer-generated enumeration of the lowest numbers of socks involved and some speculations about how they continue - very likely-looking speculations (which are indeed true), but without fully proving them. In case anyone still wants to try doing this on their own, I'll spoiler-protect what follows, or at least most of it (spoiler protection doesn't work on one part, but that part might reasonably be regarded as an additional minor hint to anyone still working on it!)...

Looking at UncleEbenezer's enumeration of particular solutions, it appears that each step from one to the next grows the numbers by nearly the same factor and that the factors are asymptotically approaching a limit:

(s,w) = (4,3)
(s,w) = (21,15): s grows by a factor of 21/4 = 5.250000...
(s,w) = (120,85): s grows by a factor of 120/21 = 5.714285...
(s,w) = (697,493): s grows by a factor of 697/120 = 5.808333...
(s,w) = (4060,2871): s grows by a factor of 4060/697 = 5.824964...
(s,w) = (23661,16731): s grows by a factor of 23661/4060 = 5.827832...
(s,w) = (137904,97513): s grows by a factor of 137904/23661 = 5.828325...
(s,w) = (803761,568345): s grows by a factor of 803761/127904 = 5.828409...
(s,w) = (4684660,3312555): s grows by a factor of 4684660/803761 = 5.828424...
(s,w) = (27304197,19306983): s grows by a factor of 27304197/4684660 = 5.828426...
(s,w) = (159140520,112529341): s grows by a factor of 159140520/27304197 = 5.828427...
(s,w) = (927538921,655869061): s grows by a factor of 927538921/159140520 = 5.828427...

Furthermore, w is growing by about the same factor each time, and so is b = s-w. In other words, the solutions seem to be growing close-to-exponentially. This will be the motivation for a later step in what I do below.

It's clear that the sequence of possible values 1, 6, 35, 204, ... for b is generally quite a bit smaller than the corresponding sequences for w (3, 15, 85, 493, ...) or s (4, 21, 120, 697, ...), and so a search for possible values of b can be expected to get more 'hits' than a similar search for possible values of w or s. So I rewrote bluedonkey's equation s(s-1) = 2w(w-1) by substituting s = w+b in it, getting (w+b)(w+b-1) = 2w(w-1). Expanding the products, that is:

w^2 + 2bw + b^2 - w - b = 2w^2 - 2w

or after cancelling equal terms on both sides and reversing the equality:

w^2 - w = 2bw + b^2 - b

For any particular value of b we might be considering, this is equivalent to the quadratic equation:

Aw^2 + Bw + C = 0

in w for A=1, B=-1-2b, C = b-b^2. The well-known formula for solutions of a quadratic gives:

w = (-B +/- sqrt(B^2-4AC))/2A

= (2b+1 +/- sqrt((-2b-1)^2 + 4(b^2-b)))/2

= (2b+1 +/- sqrt(8b^2+1))/2

We're only interested in positive integer solutions for w, so any value of b for which 8b^2+1 is not an exact square won't produce any w values of interest. Conversely, 8b^2+1 is odd, so if it is an exact square, sqrt(8b^2+1) will be odd, and since 2b+1 is also odd, adding or subtracting the two will produce an even integer, and so the possibilities for w will be integers.

If b=0, the formula gives w=0 or 1. That makes s = w+b = 0 or 1, and (s,w) = (0,0) or (1,1) are both valid solutions of bluedonkey's equation s(s-1) = 2w(w-1), but not really of (my extension of) the socks problem, because it's impossible to withdraw two socks at random from a set of less than two socks! So we should only consider strictly positive values of b, and for those, sqrt(8b^2+1) >= sqrt(4b^2+4b+1) = 2b+1, with equality only if b=1. So the solution w = (2b+1 - sqrt(8b^2+1))/2 is only possibly of interest if b=1, but it gives w=0 and so s = b+w = 1, and while (s,w) = (1,0) is also a solution to bluedonkey's equation, it is again not a solution to the socks problem for exactly the same reason.

So basically, the socks problem is equivalent to asking for positive values of b such that 8b^2+1 is an exact square, then for each value of b, generating the corresponding values of w = (2b+1 + sqrt(8b^2+1))/2 and s = w+b. So essentially, the problem is equivalent to finding pairs of squares such that the larger is one more than 8 times the smaller.

More generally, the problem of finding pairs of squares such that the larger is one more than N times the smaller (for N not itself a square) is known as Pell's equation and there's a lot of mathematical theory behind it, which is described in the link but I won't go into much of it here. The bit I will go into is that if (p,q) and (P,Q) are two solutions for a particular value of N, i.e. if p^2 = Nq^2+1 and P^2 = NQ^2+1, then so is (pP+NqQ,pQ+Pq), because:

(pP+NqQ)^2 = p^2*P^2 + NpPqQ + N^2*q^2*Q^2
= (Nq^2+1)*P^2 + NpPqQ + N*(p^2-1)*Q^2, using p^2 = Nq^2+1 twice,
= Nq^2*P^2 + P^2 + NpPqQ + Np^2*Q^2 - N*Q^2
= N(p^2*Q^2 + pPqQ + q^2*P^2) + (P^2 - N*Q^2)
= N*(pQ+Pq)^2 + 1, using P^2 = NQ^2+1

Furthermore, (p0,q0) = (1,0) is obviously always a solution, and if (p1,q1) is the smallest positive solution, then the sequence we get from repeatedly applying the above to produce (p_(i+1), q_(i+1)) = (p1*pi+N*q1*qi, p1*qi+pi*q1) contains all the positive solutions. (The fact that they're positive solutions is a simple consequence of the above; the fact that they're all of the positive solutions can be proved by supposing that (p,q) lies strictly between (p_i,q_i) and (p_(i+1),q_(i+1)) and showing that if we repeatedly apply the above to (p,q) and (p1,-q1), we'll eventually get down to a solution strictly between (1,0) and (p1,q1) - which isn't possible because (p1,q1) is the smallest positive solution.)

That basically means that if we can find the smallest positive solution to Pell's equation for a particular N, we can generate the positive solutions as far as desired. That "if" is an important one, as the smallest positive solution can involve numbers much bigger than N, but in the particular case N=8 that we're looking at, the smallest positive solution is very easy to find: it's (3,1), since 3^2 = 8*1^2 + 1. So that gives us the sequence of possible values of b in the socks problem: it is identical to the sequence of q values generated by starting with the Pell's equation solution (p,q) = (1,0) and repeatedly applying the transformation (p,q) --> (3p+8q,3q+p):

(1,0) --> (3*1+8*0, 3*0+1) = (3,1)
--> (3*3+8*1, 3*1+3) = (17,6)
--> (3*17+8*6, 3*6+17) = (99,35)
--> (3*99+8*35, 3*35+99) = (577,204)
--> (3*577+8*204, 3*204+577) = (3363,1189)
--> ...

Now I'm finally ready for the hint at the start of this about the values appearing to grow close to exponentially. Just suppose that there were solutions to (p_(i+1),q_(i+1) = (3p_i+8q_i,3q_i+p_i) that really do grow exponentially - what would they be? They would be of the form (p_i,q_i) = (P*f^i,Q*f^i) for some real numbers P, Q and f, and they would have to satisfy:

(P*f^(i+1), Q*f^(i+1)) = (p_(i+1), q_(i+1))
= (3p_i+8q_i, 3q_i+p_i)
= ((3P+8Q)*f^i, (3Q+P)*f^i)

for all i. Dividing through by f^i, that will be the case if (P*f, Q*f) = (3P+8Q, 3Q+P), i.e. if f = (3P+8Q)/P = (3Q+P)/Q, or multiplying through by PQ, if Q(3P+8Q) = P(3Q+P). Subtracting 3PQ from both sides, that gives 8Q^2 = P^2, i.e. P = +/- sqrt(8)*Q, and then substituting that back into f = (3Q+P)/Q gives f = 3 +/- sqrt(8). So we've got two families of exactly exponentially growing solutions to (p_(i+1),q_(i+1) = (3p_i+8q_i,3q_i+p_i), namely:

(p_i, q_i) = (sqrt(8)*Q*(3+sqrt(8))^i, Q*(3+sqrt(8)^i))

and:

(p_i, q_i) = (-sqrt(8)*Q*(3-sqrt(8))^i, Q*(3-sqrt(8)^i))

Furthermore, any sum of members of the two families:

(p_i, q_i) = (sqrt(8)*Q1*(3+sqrt(8))^i - sqrt(8)*Q2*(3-sqrt(8))^i,
Q1*(3+sqrt(8))^i + Q2*(3-sqrt(8))^i)

will still be a solution to (p_(i+1),q_(i+1) = (3p_i+8q_i,3q_i+p_i), just not an exponentially growing one - though as 3+sqrt(8) is quite a lot bigger than 1 and 3-sqrt(8) quite a lot smaller than 1, we can reasonably expect such a sum to quite rapidly approach growing by a factor of 3+sqrt(8) = 5.82842712474619009760... each time we move on to the next solution - which is the sort of behaviour observed in UncleEbenezer's enumeration.

We know that we're interested in a solution to (p_(i+1),q_(i+1) = (3p_i+8q_i,3q_i+p_i) with (p_0,q_0) = (1,0), and the above formula for the sum of members of the two exponentially growing families of solutions gives:

(p_0, q_0) = (sqrt(8)*Q1*(3+sqrt(8))^0 - sqrt(8)*Q2*(3-sqrt(8))^0,
Q1*(3+sqrt(8))^0 + Q2*(3-sqrt(8))^0)

= (sqrt(8)*(Q1-Q2), Q1+Q2)

So if we make Q1-Q2 = 1/sqrt(8), Q1+Q2 = 0, or equivalently Q1 = 1/sqrt(32), Q2 = -1/sqrt(32), we'll get a solution to (p_(i+1),q_(i+1) = (3p_i+8q_i,3q_i+p_i) which starts correctly with the not-entirely-positive integer-valued (1,0) solution to Pell's equation for N=8, and the fact that it is a solution to (p_(i+1),q_(i+1) = (3p_i+8q_i,3q_i+p_i) implies that despite the large number of square roots floating around, its values are all integers. So the positive solutions to Pell's equation are:

(p_i, q_i) = (((3+sqrt(8))^i + (3-sqrt(8))^i)/2, ((3+sqrt(8))^i - (3-sqrt(8))^i)/sqrt(32))

for i = 1, 2, 3, ..., and so the values of b in my extension to the socks problem are the second components of that:

b_i = ((3+sqrt(8))^i - (3-sqrt(8))^i)/sqrt(32)

for i = 1, 2, 3, ...

The values of sqrt(8b_i^2+1) are the first components ((3+sqrt(8))^i + (3-sqrt(8))^i)/2, and the values of w_i are (2b_i+1 + sqrt(8b_i^2+1))/2, or:

w_i = (((3+sqrt(8))^i - (3-sqrt(8))^i)/sqrt(8) + 1 + ((3+sqrt(8))^i + (3-sqrt(8))^i)/2)/2
= (1/4+1/sqrt(32)) * (3+sqrt(8))^i + (1/4-1/sqrt(32)) * (3-sqrt(8))^i + 1/2

And s_i = w_i + b_i, so:

s_i = (1/sqrt(8)+1/4) * (3+sqrt(8))^i - (1/sqrt(8)-1/4) * (3-sqrt(8))^i + 1/2

To finish, these imply that b_i / (3+sqrt(8))^i = (1 - ((3-sqrt(8))/(3+sqrt(8)))^i)/sqrt(32) tends to 1/sqrt(32) as i tends to infinity, and similarly (b_i-1) / (3+sqrt(8))^i tends to 1/sqrt(32) and s_i / (3+sqrt(8))^i and (s_i-1) / (3+sqrt(8))^i both tend to 1/sqrt(8) + 1/4 as i tends to infinity. So the probability (b_i*(b_i-1))/(s_i*(s_i-1) of two black socks = ((b_i / (3+sqrt(8))^i) * ((b_i-1) / (3+sqrt(8))^i)) / ((s_i / (3+sqrt(8))^i) * ((s_i-1) / (3+sqrt(8))^i)) tends to (1/sqrt(32))^2 / (1/sqrt(8)+1/4)^2 = (1/32) / (1/8+1/sqrt(32)+1/16) = 1/(6+4sqrt(2)) = 1.5-sqrt(2) as i tends to infinity. Together with UncleEbenezer's argument earlier in the thread that that probability must remain below 1.5-sqrt(2), this implies that it is indeed asymptotic to 1.5-sqrt(2) - it can be brought arbitrarily close to 1.5-sqrt(2) but never quite to it.

Whew! Must admit I hadn't remembered just how much detail needs to be got right in that sort of argument when I posed the question!

Gengulphus

The Lemon Fool

Donate to Remove ads

Got a credit card? use our Credit Card & Finance Calculators

Socks

Re: Socks

Who is online