perl: June 2009 Archives

The Perl Foundation GSoC2009 Roundup

So much has been going on this summer of code! As a recap, TPF got nine student slots this year, which means we have nine mentors and nine students working on various things this summer. Here is a sample of what has been going on recently.

Pascal Gaudette describes his love for debugging tricky HTTP/1.1 issues in Mojo and has even added a "featurette." Devin Austin has been talking about eviscerating Catalyst::Helper and Daniel Arbelo Arrocha has thoughtfully detailed the difference between bonding and bondage. My student, Robert Kuo, has been busy reading the mathematical paper and example C implementation of the Strong Lucas Pseudoprime primality test for Math::Primality. While installing Math::GMPz, which we use to access the GNU Multiprecision Library (GMP), he found a small issue which caused some test failures and submitted a bug report.

Math::Primality also very recently gained a working is_prime() function, which works for arbitrary-sized integers, due mostly to Robert Kuo's implementation of is_strong_lucas_pseudoprime() being finished. Now you can test for prime numbers in Perl without installing Math::Pari! More about this in a separate post!

Ryan Jendoubi is working on a Perl interface for wxWebkit and Hinrik Örn Sigurðsson
is working on the command line utility to read Perl 6 documentation called grok.
Justin Hunter has been hacking on his blogging software so that he can blog about
his work on SQL::Translator. Sometimes it's a vicious cycle...

Back to hacking on some code!


To use less memory! You may think that

my $x = 'foo'; my $y = $x;

would store the string 'foo' in one place, and only make a separate copy when $y gets changed. It doesn't. To see this, we can use Devel::Peek from the perl debugger. We start it up with

perl -de0

which I have aliased to p in vim, because yes, I am that lazy and I loves me some debugger. Once we are in the debugger, we have to load the module. Note that the debugger output is bolded.

 DB<1> use Devel::Peek

Then define a variable

 DB<2> $x = 'foo'

PROTIP: You can't use my $x in the interactive debugger because each line of input to the debugger gets wrapped in its own lexical scope, so a lexical variable (i.e., one defined with my) will not exist outside of this inner scope.
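The scoping rule can be tried outside the debugger with a rough sketch. It assumes that a string eval is a fair stand-in for one line of debugger input, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Each string eval gets its own lexical scope, much like a debugger line.
eval q{ my $lexical = 'foo'; 1 } or die $@;

# A later eval cannot see that lexical. With strict vars off, the name
# refers to the (undefined) package variable $main::lexical instead.
my $lexical_seen = eval q{
    no strict 'vars';
    no warnings;
    defined $lexical ? 1 : 0;
};

# A package variable (assigned without "my") survives between evals.
eval q{ no strict 'vars'; $package_var = 'foo'; 1 } or die $@;
my $package_seen = eval q{
    no strict 'vars';
    defined $package_var ? 1 : 0;
};

print "lexical visible in a later eval: $lexical_seen\n";
print "package variable visible in a later eval: $package_seen\n";
```

This is why the transcript below assigns to plain $x rather than my $x.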

Now we use the "x" debugger command, which prints out the result of executing some Perl code. We give it the code Dump $x, which calls Devel::Peek's Dump function on $x and prints out low-level information about the internal properties of the variable.

 DB<3> x Dump $x
SV = PV(0x9542ca4) at 0x9565924
 REFCNT = 1
 FLAGS = (POK,pPOK)
 PV = 0x95731b8 "foo"\0
 CUR = 3
 LEN = 4
 empty array


SV is Perl-internals-speak for "scalar value," and the PV means that it holds a string. The reference count is 1, which means only one thing refers to this value: the variable $x itself. When a variable goes out of scope, its REFCNT goes to 0 and Perl's reference-counting garbage collector recycles it. The POK in FLAGS tells us that the scalar contains a valid string. Take note that the PV line shows us the memory address where the string is stored, as well as its value and the fact that it is null-terminated. CUR is the current length of the string in bytes, and LEN is the size of the buffer allocated for it.
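If you would rather read a few of these fields from a script than from Dump's output, the core B module exposes them. This is only a sketch of one way to do it, not what Devel::Peek does internally, and the variable names are mine.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_POK);

my $x = 'foo';

# svref_2object takes a reference and returns a B object that describes
# the underlying SV. For a plain string that object is a B::PV.
my $sv = svref_2object(\$x);

my $is_string = ( $sv->FLAGS & SVf_POK ) ? 1 : 0;   # the POK flag
my $cur       = $sv->CUR;                           # bytes in use
my $len       = $sv->LEN;                           # bytes allocated

printf "class=%s POK=%d CUR=%d LEN=%d\n", ref $sv, $is_string, $cur, $len;
```

CUR should match the 3 shown by Dump. LEN depends on the perl version and build, so do not expect it to be 4 everywhere.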

Now let's set a new variable equal to $x.

 DB<4> $y = $x

Now let's take a look at it:

 DB<5> x Dump $y
SV = PV(0x9542c2c) at 0x958ab70
 REFCNT = 1
 FLAGS = (POK,pPOK)
 PV = 0x958b968 "foo"\0
 CUR = 3
 LEN = 4
 empty array

The PV line is the most interesting. If you compare it to the one above, you will see that it is a different memory address! Perl is not only paying the fixed overhead of a second variable; the string 'foo' itself is being stored in two different places!
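The same comparison can be sketched in a script without Devel::Peek, using pack's 'p' template to get a raw pointer to each scalar's string buffer. On the perl used for the transcript above, the two buffers differ right after the assignment. As far as I know, perl 5.20 and later may share the buffer copy-on-write until one of the variables is written to, so the first result below depends on your perl.

```perl
use strict;
use warnings;

my $x = 'foo';
my $y = $x;

# pack 'p' yields a pointer to the scalar's string buffer as raw bytes.
# Comparing those bytes tells us whether two scalars share one buffer.
my $shared_after_copy = ( pack( 'p', $x ) eq pack( 'p', $y ) ) ? 1 : 0;

# Writing to $y means it must own a private buffer from here on.
$y .= 'bar';
my $shared_after_write = ( pack( 'p', $x ) eq pack( 'p', $y ) ) ? 1 : 0;

# The hex strings are the pointer bytes in memory order, so they are only
# meant for comparing against each other.
printf "x buffer: %s\n", unpack( 'H*', pack( 'p', $x ) );
printf "y buffer: %s\n", unpack( 'H*', pack( 'p', $y ) );
print "shared right after the copy: $shared_after_copy\n";
print "shared after writing to \$y:  $shared_after_write\n";
```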

Now imagine that our string is a few hundred megabytes, and you are manipulating it. You are really going to notice having a few extra copies of it around, aren't you! If you know that you have a lot of data structures that have the same string in them, use a reference to save memory. If you are running up against "perl Out of memory!", this could be a trick that gets you out of that bind without buying more RAM.
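Here is a minimal sketch of what "use a reference" looks like in practice. The hash names are hypothetical. Note the trade-off: every reference points at the same scalar, so a change to the string is visible through all of them, while the copies are independent.

```perl
use strict;
use warnings;

my $big = 'i like eating memory' x 1000;    # 20,000 characters

my %by_copy = ( b => $big,  c => $big );    # values are independent copies
my %by_ref  = ( b => \$big, c => \$big );   # values point at $big itself

# Comparing references numerically compares the addresses they point to.
my $same_target = ( $by_ref{b} == $by_ref{c} ) ? 1 : 0;

# Dereference with ${ ... } to get at the string.
my $length_via_ref = length( ${ $by_ref{c} } );

# Changing $big is seen through the references but not by the copies.
$big .= '!';
my $ref_sees_change  = ( length( ${ $by_ref{b} } ) == 20001 ) ? 1 : 0;
my $copy_sees_change = ( length( $by_copy{b} ) == 20001 )     ? 1 : 0;

print "references share one scalar: $same_target\n";
print "length through a reference:  $length_via_ref\n";
print "reference sees the change:   $ref_sees_change\n";
print "copy sees the change:        $copy_sees_change\n";
```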

Let's check out the memory savings with Devel::Size, a nice module which can tell you how much memory your Perl variables are using.
 
 DB<1> use Devel::Size qw/size total_size/

Next we create a longish string.

 DB<2> $x = 'i like eating memory' x 1000

We can use the size() function to see how much memory it is using:

 DB<3> x size($x)
0  20036

Now let us measure a hash which has this string as two of its values. We use the total_size() function, which follows references:

 DB<4> x total_size( { a => "foo", b => $x, c => $x } )
0  40253

Now we do the same, but we store references to the strings:

 DB<5> x total_size( { a => "foo", b => \$x, c => \$x } )
0  20249

Roughly half the size!

And that is why you would use references to Perl strings. Why else would you?

About this Archive

This page is an archive of entries in the perl category from June 2009.
