Permutations, Combinations & Probability
Posted by James Mead Tue, 04 Nov 2008 00:27:06 GMT
This weekend I spent an enjoyable couple of hours trying to solve a very simple sounding problem…
Given a set of four digits chosen at random (0-9 equally likely in each case), and a second set of four digits similarly chosen at random, on average how many digits will match (order is not important).
Update: Although I said that 0-9 is equally likely in each case, I didn’t make it clear that repetitions are allowed i.e. 1234 vs 5432 => 3 matches
Update 2: I just ran another brute force approach using randomly generated sequences and got a very similar answer (approx 1.21) which gives me a bit more faith in my other attempt
I spent quite a while trying to dredge my memory (and the web) about permutations & combinations, but never really got to grips with the problem. In the end, I hacked together a very nasty (probably inefficient & un-idiomatic) chunk of Ruby to generate all the possible permutations of eight digits and count how many duplicates there were between the first four digits and the last four digits. From these I think you can work out the probability of one match, the probability of two matches, etc. Then I think you can calculate the average number of matches by weighting each probability by the number of matches. I may well be wrong!
In any case, I then generalized the Ruby to generate the results for different numbers of digits and different numeric bases (e.g. base 4 => 0, 1, 2, 3). I was hoping to see patterns in the results, so I could possibly reverse engineer a formula for the answer, but I haven’t managed to spend much time on that bit. But I’m sufficiently interested to embarrass myself by posting this on the internet and hoping that someone will tell me how to do this in one line of Maths or, more likely, one line of Ruby.
max_places = 4
puts (['base', 'places', ''] + (0..max_places).to_a).join("\t")
(2..10).each do |base|
(1..max_places).each do |places|
counts = [0] * (places + 1)
(0...(base ** (places * 2))).each do |index|
permutation = "%0#{places * 2}d" % index.to_s(base)
original = (0...places).map { |i| permutation[i].chr }
guess = (places...(places * 2)).map { |i| permutation[i].chr }
matches = 0
guess.each do |g|
(0...places).each do |i|
if g == original[i]
original[i] = nil
matches += 1
break
end
end
end
counts[matches] += 1
end
puts (["%2d" % base, places, ''] + counts).join("\t")
$stdout.flush
end
endI think the answer for four digits between 0-9 comes out as follows:-
- Exactly zero matches = 19550250
- Exactly one match = 45608400
- Exactly two matches = 29292120
- Exactly three matches = 5373360
- Exactly four matches = 175870
Which gives an average number of matches = 1.21
However, looking at it now, I’m wondering how there could be so many permutations with exactly four matches. Oh, well. I might as well publish anyway and find out what I’ve done wrong…
In amongst it all I found a couple of nice Ruby methods I hadn’t come across before:-
- Fixnum#to_s
- String#count, which I didn’t use in the end.

I don’t think this has to do with permutations/combinations at all.
Take, for instance, a single digit in both cases. What is the chance that a digit 1-9 will match another digit 1-9? The problem in question layers on some extra complexity, and I wish I could work on it, but I gotta get to work.
But it looks like this is purely statistics… no discrete maths needed.
Given a set of four digits chosen at random (0-9 equally likely in each case), and a second set of four digits similarly chosen at random, on average how many digits will match (order is not important).
Yes, it’s just basic math. The digits are independent, so for each position there’s a 10% chance for matching, 90% chance for no match.
Looks like there’s only 0.4 digits matching on average for four digits, though.
Derek: Thanks for your interest.
Jamie: Thanks for having a go. I think your line of reasoning is similar to my initial attempt, but (although I may well be wrong) I don’t think it takes into account the fact that the digits can match in any position i.e. 1234 vs 2345 => 3 matches.
See updates above. I’m now more confident in my answer.