Matlab - An Introduction to Combinatorics
This is a guest post by Will Dwinnell. Will Dwinnell is a senior-level data miner who shares with us a common love for Matlab. Daniel and I have enjoy reading his blog, and we are extremely excited to introduce Will to our readers.
Introduction
MATLAB is used in a many fields and consequently includes a diverse array of functions. Many useful functions go neglected by programmers who are unfamiliar with them. This article will briefly explore three functions which may come in handy from time to time, even if they never become everyday workhorses. Specifically, these functions deal with combinations of items.
The “perms” Command
The perms function accepts a list of items, and returns all possible orderings of these items. You may remember this from high school as a permutation problem. Here’s a simple example:
>> perms([1:3])
ans =
3 2 1
3 1 2
2 3 1
2 1 3
1 2 3
1 3 2
This function is simple enough, but spares one managing the loops this would otherwise require. One nice thing about perms is that it will accepts non-numeric data as cell array lists:
>> perms({'red', 'green', 'blue'})
ans =
'blue' 'green' 'red'
'blue' 'red' 'green'
'green' 'blue' 'red'
'green' 'red' 'blue'
'red' 'green' 'blue'
'red' 'blue' 'green'
This function might be useful in resource assignment problems, such as assigning sports teams their preferences of players, fields, etc., with teams selecting in the indicated order. Each time this round of selections takes place, the next row would be used.
The “randperm” Command
Many problems require the shuffling of a list of items. This comes up in randomizing the order of examples for machine learning systems, shuffing virtual cards in game programs and simulations. This can be accomplised using MATLAB’s widely used rand and sort functions:
>> MyList = [1:6]'
MyList =
1
2
3
4
5
6
>> [R Index] = sort(rand(6,1))
R =
0.1419
0.4218
0.4854
0.8003
0.9157
0.9572
Index =
4
5
2
3
6
1
>> MyList = MyList(Index)cle
MyList =
4
5
2
3
6
1
This is all unnecessarily complex, though, since randperm will do this in a single step:
>> MyList = [1:6]'
MyList =
1
2
3
4
5
6
>> MyList = MyList(randperm(6))
MyList =
4
3
1
5
6
2
randperm(n) returns the integers from 1 to n, in a random order:
>> randperm(8)
ans =
8 6 4 5 1 7 3 2
Note that randperm is driven by the same random number generator that feeds rand, so calling one will affect the subsequent output of the other, and both are seeded the same way.
The “nchoosek” Command
Pop quiz! How many ways are there to draw a hand of 5 cards from a deck of 52 cards, ignoring their order? I don’t remember how to calculate this either, but the nchoosek function will do it for you, and more!
When given scalar arguments, nchoosek(n,k) will return the number of combinations of n items, taken k at a time. For instance, if one needed to know how many ways 2 things could be drawn from a set of 6:
>> nchoosek(6,2)
ans =
15
Sometimes, though, what is needed is a list of the actual combinations. With a vector for its first argument, nchoosek will return all possible combinations:
>> nchoosek(1:6,2)
ans =
1 2
1 3
1 4
1 5
1 6
2 3
2 4
2 5
2 6
3 4
3 5
3 6
4 5
4 6
5 6
As with perms, this saves potentially much looping and other headaches for the MATLAB programmer. Also like perms, nchoosek will accept a cell array of strings as the list of items:
>> nchoosek({'chicken', 'fish', 'pizza', 'beef', 'soup', 'ratatouille'},2)
ans =
'chicken' 'fish'
'chicken' 'pizza'
'chicken' 'beef'
'chicken' 'soup'
'chicken' 'ratatouille'
'fish' 'pizza'
'fish' 'beef'
'fish' 'soup'
'fish' 'ratatouille'
'pizza' 'beef'
'pizza' 'soup'
'pizza' 'ratatouille'
'beef' 'soup'
'beef' 'ratatouille'
'soup' 'ratatouille'
Incidentally, there are over two and a half million distinct 5-card hands from a deck of 52:
>> nchoosek(52,5)
ans =
2598960
Closing Thoughts
Take care not to apply perms and nchoosek to calculations which are too large: the number of theoretically possible combinations or permutations can grow quite large.
It is often helpful to be familiar with parts of MATLAB which you may use less frequently. My recommendation is to occasionally glance through the help facility. Calling help by itself yields a list of sub-topics, such as elfun (elementary math), graph3d (”three dimensional graphs”) and strfun (”character strings”). Accessing help for any of these provides a list of functions which you should explore for future use.
About the Author
Will Dwinnell is a senior-level data miner who writes the Data Mining in MATLAB Web log. We suggest you visit his site and see what he has to offer. There are some very interesting articles there!