Permutations, Combinations & Probability

Posted by James Mead Tue, 04 Nov 2008 00:27:06 GMT

This weekend I spent an enjoyable couple of hours trying to solve a very simple sounding problem…

Given a set of four digits chosen at random (0-9 equally likely in each case), and a second set of four digits similarly chosen at random, on average how many digits will match (order is not important).

Update: Although I said that 0-9 is equally likely in each case, I didn’t make it clear that repetitions are allowed i.e. 1234 vs 5432 => 3 matches

Update 2: I just ran another brute force approach using randomly generated sequences and got a very similar answer (approx 1.21) which gives me a bit more faith in my other attempt

I spent quite a while trying to dredge my memory (and the web) about permutations & combinations, but never really got to grips with the problem. In the end, I hacked together a very nasty (probably inefficient & un-idiomatic) chunk of Ruby to generate all the possible permutations of eight digits and count how many duplicates there were between the first four digits and the last four digits. From these I think you can work out the probability of one match, the probability of two matches, etc. Then I think you can calculate the average number of matches by weighting each probability by the number of matches. I may well be wrong!

In any case, I then generalized the Ruby to generate the results for different numbers of digits and different numeric bases (e.g. base 4 => 0, 1, 2, 3). I was hoping to see patterns in the results, so I could possibly reverse engineer a formula for the answer, but I haven’t managed to spend much time on that bit. But I’m sufficiently interested to embarrass myself by posting this on the internet and hoping that someone will tell me how to do this in one line of Maths or, more likely, one line of Ruby.

  max_places = 4 
  puts (['base', 'places', ''] + (0..max_places).to_a).join("\t")

  (2..10).each do |base|
    (1..max_places).each do |places|
      counts = [0] * (places + 1)

      (0...(base ** (places * 2))).each do |index|
       permutation = "%0#{places * 2}d" % index.to_s(base)
       original = (0...places).map { |i| permutation[i].chr }
       guess = (places...(places * 2)).map { |i| permutation[i].chr }
       matches = 0 
       guess.each do |g| 
         (0...places).each do |i| 
           if g == original[i]
             original[i] = nil 
             matches += 1
             break
           end 
         end 
       end 
       counts[matches] += 1
      end 

      puts (["%2d" % base, places, ''] + counts).join("\t")
      $stdout.flush
    end 
  end

I think the answer for four digits between 0-9 comes out as follows:-

  • Exactly zero matches = 19550250
  • Exactly one match = 45608400
  • Exactly two matches = 29292120
  • Exactly three matches = 5373360
  • Exactly four matches = 175870

Which gives an average number of matches = 1.21

However, looking at it now, I’m wondering how there could be so many permutations with exactly four matches. Oh, well. I might as well publish anyway and find out what I’ve done wrong…

In amongst it all I found a couple of nice Ruby methods I hadn’t come across before:-

Tags , , ,  | 4 comments

Comments

  1. Derek said about 15 hours later:

    I don’t think this has to do with permutations/combinations at all.

    Take, for instance, a single digit in both cases. What is the chance that a digit 1-9 will match another digit 1-9? The problem in question layers on some extra complexity, and I wish I could work on it, but I gotta get to work.

    But it looks like this is purely statistics… no discrete maths needed.

  2. Jamie Macey said about 16 hours later:

    Given a set of four digits chosen at random (0-9 equally likely in each case), and a second set of four digits similarly chosen at random, on average how many digits will match (order is not important).

    Yes, it’s just basic math. The digits are independent, so for each position there’s a 10% chance for matching, 90% chance for no match.

    Looks like there’s only 0.4 digits matching on average for four digits, though.

  3. Floehopper said about 24 hours later:

    Derek: Thanks for your interest.

    Jamie: Thanks for having a go. I think your line of reasoning is similar to my initial attempt, but (although I may well be wrong) I don’t think it takes into account the fact that the digits can match in any position i.e. 1234 vs 2345 => 3 matches.

  4. Floehopper said about 24 hours later:

    See updates above. I’m now more confident in my answer.

Comments are disabled