<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vetta project &#187; Python</title>
	<atom:link href="http://www.vetta.org/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.vetta.org</link>
	<description></description>
	<lastBuildDate>Tue, 31 Jan 2012 23:29:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Tokyo: A Cython BLAS wrapper for fast matrix math</title>
		<link>http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/</link>
		<comments>http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/#comments</comments>
		<pubDate>Tue, 15 Sep 2009 01:15:04 +0000</pubDate>
		<dc:creator>Shane Legg</dc:creator>
				<category><![CDATA[Programing]]></category>
		<category><![CDATA[BLAS]]></category>
		<category><![CDATA[Cython]]></category>
		<category><![CDATA[LAPACK]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Matrix]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Scipy]]></category>

		<guid isPermaLink="false">http://www.vetta.org/?p=594</guid>
		<description><![CDATA[Prototyping mathematical code in Python with the Scipy/Numpy libraries and then switching to Cython for speed often works well, but there are limitations. The main problem that has been bugging me recently is the speed of matrix function calls. What &#8230; <a href="http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Prototyping mathematical code in Python with the Scipy/Numpy libraries and then switching to Cython for speed often works well, but there are limitations.  The main problem that has been bugging me recently is the speed of matrix function calls.  What happens is that your Cython code needs to compute, say, the outer product of two vectors, and so makes a call to Numpy.  At this point everything switches to very slow interpreted Python code which does a few checks etc. before calling into the underlying fast BLAS library that does the actual work.  For large matrices the cost of this wrapping code isn&#8217;t a big deal, but for small matrices it can be a huge performance hit.</p>
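<p>To get a rough feel for this per-call overhead, here is a small timing sketch.  It assumes Numpy is installed; the helper name <code>per_call_seconds</code> is made up for illustration, and the numbers will differ from machine to machine.  If the wrapping cost dominates, the time per output element should be much worse for tiny vectors than for large ones.</p>

```python
import timeit

import numpy as np


def per_call_seconds(n, repeats=2000):
    """Average wall-clock time of one np.outer call on two length-n vectors."""
    x = np.random.rand(n)
    y = np.random.rand(n)
    total = timeit.timeit(lambda: np.outer(x, y), number=repeats)
    return total / repeats


if __name__ == "__main__":
    for n in (4, 40, 400):
        t = per_call_seconds(n)
        # An n x n result has n*n elements, so t / (n*n) is the cost per element.
        print("n=%4d  %.2e s per call  %.2e s per element" % (n, t, t / (n * n)))
```

<p>This only measures the Numpy side, so it says nothing about how fast a direct BLAS call would be; it just shows where the fixed cost bites.</p>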
<p>To solve this problem I&#8217;ve created a Cython wrapper for many of the more common BLAS functions.  It&#8217;s called Tokyo: I often name code after cities and both Tokyo and BLAS/LAPACK were big, fast and very foreign to start with!  At the moment Tokyo only wraps the BLAS routines for vectors and general matrices with single and double precision.  If you want to add other things such as banded matrices, complex numbers or LAPACK calls: just look at what I&#8217;ve already done and add the functions you need.  I&#8217;ve also added a few extra functions that I find useful when doing matrix calculations.  The idea is that Tokyo will eventually encompass all of BLAS and LAPACK.<br />
<span id="more-594"></span></p>
<p>The speedup you can expect varies a lot depending on the size of the vectors and matrices involved and the operation being computed.  An outer product on length 4 vectors can be over 150 times faster, while a 20&#215;20 matrix multiplication is about the same speed.  After installing Tokyo, execute the single and double precision speed tests to see how much faster the Tokyo routines are compared to using Numpy on your machine.  For my own reinforcement learning research I&#8217;ve managed to get speedups of between 5 and 20 times with various algorithms.</p>
<p>Here&#8217;s a simple example of a vector outer product, first in pure Python with Numpy:</p>
<p><code>import numpy as np</p>
<p>x = np.array( [1.0, 2.0, 3.0, 4.0] )<br />
y = np.array( [7.0, 8.0, 9.0, 0.0] )</p>
<p>print( np.outer( x, y ) )<br />
</code><br />
And now a fast version in Cython using Tokyo:</p>
<p><code>import numpy as np<br />
cimport numpy as np<br />
import tokyo<br />
cimport tokyo</p>
<p>cdef np.ndarray x = np.array( [ 1.0, 2.0, 3.0, 4.0 ] )<br />
cdef np.ndarray y = np.array( [ 7.0, 8.0, 9.0, 0.0 ] )</p>
<p>print( tokyo.dger( x, y ) )<br />
</code><br />
As you can see, I use the BLAS names for functions: here &#8220;dger&#8221; is short for &#8220;double precision general rank-1&#8221;.  BLAS naming seems confusing to start with, but after a little study it&#8217;s actually quite simple.  As usual in Cython, giving x and y a cdef type isn&#8217;t necessary, but it allows Cython to produce faster C code.</p>
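<p>As a sketch of how the naming works for the matrix routines: a name is a precision letter, then a two-letter matrix type, then the operation.  The lookup tables below are deliberately partial, and the function <code>describe_blas_name</code> is a made-up helper, not part of Tokyo or BLAS.  Vector-only routines such as daxpy do not follow this three-part pattern and are not handled.</p>

```python
# Partial lookup tables for the BLAS matrix-routine naming scheme:
# <precision><matrix type><operation>, e.g. d + ge + r = dger.
PRECISION = {
    "s": "single precision",
    "d": "double precision",
    "c": "single precision complex",
    "z": "double precision complex",
}
MATRIX = {
    "ge": "general",
    "gb": "general banded",
    "sy": "symmetric",
    "tr": "triangular",
}
OPERATION = {
    "mv": "matrix-vector product",
    "mm": "matrix-matrix product",
    "r": "rank-1 update",
}


def describe_blas_name(name):
    """Decode a routine name of the form <precision><matrix type><operation>."""
    precision = PRECISION[name[0]]
    matrix = MATRIX[name[1:3]]
    operation = OPERATION[name[3:]]
    return "%s, %s, %s" % (precision, matrix, operation)


if __name__ == "__main__":
    for routine in ("dger", "sgemm", "dsymv"):
        print("%s: %s" % (routine, describe_blas_name(routine)))
```

<p>In standard BLAS the rank-1 update computes A := alpha * x * y^T + A, which reduces to a plain outer product when A starts at zero and alpha is 1.</p>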
<p>As I suspect that quite a few other people have run into this matrix speed problem I figure it&#8217;s worth sharing the code.  Finally, thanks to Dag Sverre Seljebotn&#8217;s Scipy 2009 tutorial which got me started on writing this.</p>
<p>Get Tokyo here:  <a href="http://www.vetta.org/software/tokyo_v0.3.tgz">www.vetta.org/software/tokyo_v0.3.tgz</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SciPy &#8211; some more thoughts</title>
		<link>http://www.vetta.org/2008/05/scipy-some-more-thoughts/</link>
		<comments>http://www.vetta.org/2008/05/scipy-some-more-thoughts/#comments</comments>
		<pubDate>Fri, 09 May 2008 20:39:30 +0000</pubDate>
		<dc:creator>Shane Legg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Scipy]]></category>

		<guid isPermaLink="false">http://www.vetta.org/?p=53</guid>
		<description><![CDATA[With over 20,000 visitors in 2 days I discovered what my poor web server&#8217;s maximum capacity is: about 70 visitors a minute.  It doesn&#8217;t sound like that much to me, but a number of times I had to ask my &#8230; <a href="http://www.vetta.org/2008/05/scipy-some-more-thoughts/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>With over 20,000 visitors in 2 days I discovered what my poor web server&#8217;s maximum capacity is: about 70 visitors a minute.  It doesn&#8217;t sound like that much to me, but a number of times I had to ask my ISP to resuscitate the thing.  I guess all the interpreted PHP code and database access required to build each page is quite computationally expensive.  Anyway, I thought I&#8217;d write a short addendum for anybody who is still interested.</p>
<p>First of all, things like SciPy are obviously good tools for certain types of problems.  If you&#8217;re writing a front end system for a corporate database system or a text editor, forget about it.  That said, plain Python is a pretty good way to code most things in terms of being a language that is somehow both concise and simple.  In my opinion the only real downside is speed: it&#8217;s about 100 times slower than C, or around 50 times slower than Java.  Anyway, that&#8217;s another topic; here I&#8217;m more focused on using Python in combination with SciPy.  If you&#8217;re dealing with complex transformations on data that has many dimensions, for example simulating networks or machine learning, SciPy is likely to be a good tool for the job.</p>
<p>One of the most popular comments on reddit basically said that while writing a program to use matrices can produce very short code, it&#8217;s far too difficult to debug.  I don&#8217;t buy this.  Firstly, in my experience, if you get the matrix computations wrong the output is normally complete junk.  That&#8217;s good news, because the really hard bugs to fix are ones where the output is almost right, or is right most of the time.  You also can be pretty confident that all the fancy algorithms doing efficient matrix multiplications etc. are correct.  Furthermore, because the code is so short there are only a few places where you have to look to work out what you&#8217;re doing wrong.  Finally, if you&#8217;re working in areas such as machine learning or finance then the normal way to describe things is already matrix notation.  If you want to, say, construct a non-parametric density estimate by using a multivariate Gaussian kernel, then your starting point is going to be a set of matrix equations.</p>
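<p>To make the density estimate example concrete, here is a minimal sketch of a multivariate Gaussian kernel estimate written straight from the matrix form, p(x) = (1/n) &#931; N(x; x_i, H).  It assumes Numpy; the function name <code>gaussian_kde_at</code>, the bandwidth matrix H and the sample data are all made up for illustration.</p>

```python
import numpy as np


def gaussian_kde_at(x, samples, H):
    """Gaussian kernel density estimate at point x.

    samples has shape (n, d); H is a (d, d) bandwidth (covariance) matrix.
    """
    n, d = samples.shape
    diff = samples - x                      # (n, d) differences x_i - x
    Hinv = np.linalg.inv(H)
    # Quadratic forms (x_i - x)^T H^{-1} (x_i - x) for all i in one call.
    q = np.einsum("ni,ij,nj->n", diff, Hinv, diff)
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(H))
    return np.mean(np.exp(-0.5 * q)) / norm


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    data = rng.randn(500, 3)                # made-up 3-dimensional samples
    print(gaussian_kde_at(np.zeros(3), data, 0.25 * np.eye(3)))
```

<p>The body of the estimate is three lines, which is the point: if the output is junk, there are very few places to look.</p>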
<p>What are the problems with SciPy?  As pointed out by one of the commenters to my last post, one major problem is documentation.  This is common with open source projects it seems: people like to write code but they don&#8217;t like to write documentation.  The basics seem to be covered ok, but as you get further into the libraries it&#8217;s not so good.  And it&#8217;s not just that the documentation is missing, but rather the structure of the whole SciPy system is confusing.  For example, let&#8217;s say you want to interpolate in multiple dimensions.  You go to the interpolate module and what you find is that it only does up to 2 dimensions.  So you&#8217;re out of luck, right?  No, the trick is to look in the image processing library ndimage, which has interpolation in as many dimensions as you want.  For a new user this is all really confusing.  Hopefully, as the whole system matures these things will improve.  As the documentation is wiki style, once I&#8217;m a bit more knowledgeable I&#8217;m going to do my part and help out in a few places.  It&#8217;s really the least I can do.</p>
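<p>For anybody hunting for the same thing, here is a small sketch of interpolating in three dimensions with <code>scipy.ndimage.map_coordinates</code>.  It assumes a SciPy version that ships the ndimage module; the grid and the query point are made up.  The grid samples a linear function, so linear interpolation should reproduce it closely between the nodes.</p>

```python
import numpy as np
from scipy import ndimage

# A small 3-D grid sampled from the linear function f(i, j, k) = i + 2j + 3k.
i, j, k = np.indices((4, 5, 6))
grid = (i + 2 * j + 3 * k).astype(float)

# One query point between grid nodes, in index coordinates.
# The coordinates array has shape (number of dimensions, number of points).
coords = np.array([[0.5], [1.5], [2.5]])

# order=1 asks for (multi)linear interpolation.
value = ndimage.map_coordinates(grid, coords, order=1)[0]

print(value)
print(0.5 + 2 * 1.5 + 3 * 2.5)  # the function evaluated directly, for comparison
```

<p>The same call works for any number of dimensions; only the shape of the grid and the number of rows in the coordinates array change.</p>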
<p>Then there is the question of performance.  I&#8217;ve been told that the standard Ubuntu SciPy package doesn&#8217;t use matrix libraries that are all that efficient.  If you want to have fast SciPy then you need to install BLAS, LAPACK and ATLAS yourself.  Using ATLAS you can then compile the libraries so that they will be automatically optimised for your particular CPU, memory cache sizes etc.  You can then get SciPy to use these optimised libraries.  Just look under the SciPy installation instructions for the details.  I haven&#8217;t done this yet but people who have report significant speed increases.</p>
<p>If your code makes relatively few calls (say tens of thousands or less) on relatively big matrices (say with thousands of elements) then (depending on what calls you&#8217;re making) your code will probably spend most of its time busy in the matrix library.  With an optimised library you&#8217;re then doing pretty well in terms of performance.  You might want to look at things like Parallel Python to take the performance to the next level.</p>
<p>Where performance becomes a problem is when you have the reverse situation: many calls (millions) on relatively small matrices (hundreds of elements or less).  What tends to happen then is that all the computation time is spent going from Python into the library and back out again.  The time spent actually inside the library doing computations is negligible.  In this case your code would be much faster if it was written in C++ and directly called the BLAS libraries.  Other than trying to rethink your algorithm so that each matrix operation does a bigger chunk of the total work, I&#8217;m not sure what to do about this.  Using something like Pyrex doesn&#8217;t really help all that much (Pyrex takes Python with a few extras and compiles it into C that can then be compiled into a binary library that Python can use).  The problem is that even in Pyrex code you still have the overhead of going into and back out of the SciPy library.  You can directly access matrices (ndarrays) from within Pyrex in a way which is very efficient, however you&#8217;re then left writing the matrix code yourself: ok if you want to do something trivial, but not a good idea if you need to efficiently multiply two matrices.  So in short, if you need to make a large number of calls on relatively small matrices then I&#8217;m not sure how you can do this in a way that can compare to C++ used with BLAS.  If I ever find a solution, I&#8217;ll let you know.</p>
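<p>As an illustration of the &#8220;bigger chunk per call&#8221; idea, here is a sketch that replaces many small matrix-vector products with one batched call.  It assumes a Numpy version that provides <code>einsum</code>, the data is made up, and it only applies when the small products are independent of each other, which is often not the case.</p>

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(1000, 4, 4)   # 1000 small matrices
v = rng.rand(1000, 4)      # 1000 small vectors

# Many small calls: one trip from Python into the library per matrix.
looped = np.array([np.dot(A[n], v[n]) for n in range(len(A))])

# One big call: the same 1000 products expressed as a single operation.
batched = np.einsum("nij,nj->ni", A, v)

# Both approaches should agree to floating point precision.
print(np.allclose(looped, batched))
```

<p>When each step depends on the result of the previous one this trick is not available, and the per-call overhead remains.</p>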
]]></content:encoded>
			<wfw:commentRss>http://www.vetta.org/2008/05/scipy-some-more-thoughts/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>SciPy &#8211; the embarrassing way to code</title>
		<link>http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/</link>
		<comments>http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/#comments</comments>
		<pubDate>Tue, 06 May 2008 10:31:45 +0000</pubDate>
		<dc:creator>Shane Legg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Efficiency]]></category>
		<category><![CDATA[Finance]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Scipy]]></category>

		<guid isPermaLink="false">http://www.vetta.org/?p=52</guid>
		<description><![CDATA[I&#8217;ve programmed in many languages before, indeed I&#8217;ve spent at least a year working in each of Basic, C, C++, C#, Java, assembler, Modula-2, PowerHouse and Prolog.  One thing I&#8217;ve never done before is Matlab, well except a few basic exercises for &#8230; <a href="http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve programmed in many languages before, indeed I&#8217;ve spent at least a year working in each of Basic, C, C++, C#, Java, assembler, Modula-2, PowerHouse and Prolog.  One thing I&#8217;ve never done before is Matlab, well except a few basic exercises for some course I did way back.  A couple of years ago I started using Python and more recently I&#8217;ve started to use the scipy libraries which essentially provide something similar to Matlab.  The experience has been unlike anything I&#8217;ve coded in before.  The development cycle has gone like this:</p>
<p>1) Write the code in Python like I would write it in, say, Java.  I have data stored in some places, then I have algorithms that iterate over these data structures computing stuff, calling methods, changing values and doing various complex things in order to implement the desired algorithm.  10 pages of code, somewhat general.</p>
<p>2) Then I realise that in a few places I don&#8217;t need to iterate over something, I can just use some vectors and work with those directly.  7 pages of code, a little more general.</p>
<p>3) Then I realise that part of my code is really just running an optimisation algorithm, so I can replace it with a call to an optimiser in one of the scipy libraries.  5 pages of code, and a bit faster now.</p>
<p>4) Then I try to further generalise my system and in the process I realise that really what I&#8217;m doing is taking a Cartesian space, building a multi-dimensional matrix and then applying some kind of optimiser to the space.  3 pages of code, very general.</p>
<p>5) Finally I&#8217;m like, hey, how far can I push this?  With some more thought and spending a few days trying to get my head around all the powerful scipy libraries, I finally figure out that the core of my entire algorithm can be implemented in an extremely general and yet fast way in just a few lines.  It&#8217;s really just a matrix with some flexible number of dimensions to which I am applying some kind of n-dimensional filter, followed by an n-dimensional non-linear optimiser on top of an n-dimensional interpolation and finally coordinate mapping back out of the space to produce the end results.  2 pages of code, of which half is comments, over a quarter is trivial supporting stuff like creating the necessary matrices, and just a few lines make the necessary calls to implement the algorithm.  And it&#8217;s all super general.</p>
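<p>A toy version of the first few steps, to show the shape of the shrinkage.  This is not my actual code: the straight-line fit, the data and the function names <code>cost_loop</code> and <code>cost_vec</code> are all made up, and it assumes a SciPy version that provides <code>scipy.optimize.minimize</code>.</p>

```python
import numpy as np
from scipy import optimize


def cost_loop(params, xs, ys):
    """Step 1 style: iterate over the data explicitly."""
    a, b = params
    total = 0.0
    for x, y in zip(xs, ys):
        total += (a * x + b - y) ** 2
    return total


def cost_vec(params, xs, ys):
    """Step 2 style: the same sum of squared errors as one vector expression."""
    a, b = params
    return np.sum((a * xs + b - ys) ** 2)


xs = np.linspace(0.0, 1.0, 50)
ys = 3.0 * xs + 1.0   # made-up noiseless data on the line y = 3x + 1

# Step 3 style: hand the cost function to a library optimiser
# instead of writing the search loop by hand.
result = optimize.minimize(cost_vec, x0=[0.0, 0.0], args=(xs, ys))

print(result.x)  # fitted slope and intercept
```

<p>The loop and vector versions compute the same number; the vector one is simply the equation written down, which is where the pages of code start to disappear.</p>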
<p>Now this is great in a sense.  You end up throwing away most of your code now that all the real computation work is being done by sophisticated mathematical functions which are using optimised matrix computation libraries.  The bottleneck in writing code isn&#8217;t in the writing of the code, it&#8217;s in understanding and conceptualising what needs to be done.  Once you&#8217;ve done that, i.e. come up with mathematical objects and equations that describe your algorithm, you simply express these in a few lines of scipy and hit go.</p>
<p>It&#8217;s not just with my financial software either.  I recently implemented a certain kind of neural network using nothing but scipy and found that the core of the algorithm was just one line of code &#8212; a few matrix transformations and calls to scipy functions.  I hear that one of the IDSIA guys working on playing Go recently collapsed the code he&#8217;s been working on for six months down to two pages.</p>
<p>The downside to all this is that you spend months developing your complex algorithms and when you&#8217;re done you show somebody the result of all your efforts &#8212; a page or two of code.  It looks like something that somebody could have written in an afternoon.  Even worse, <em>you</em> start to suspect that if you had really known scipy and spent a few days carefully thinking about the problem to start with, then you probably <em>could</em> have coded it in an afternoon.  It&#8217;s a little embarrassing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/feed/</wfw:commentRss>
		<slash:comments>52</slash:comments>
		</item>
	</channel>
</rss>

