London Perl Mongers Teach-In

Dave Cross <dave@mag-sol.com>

Magnum Solutions Ltd

Session 2

  • ORM

  • Testing

  • Benchmarking

Object Relational Mapping

Object Relational Mapping

  • Mapping database relations into objects

  • Tables (relations) map onto classes

  • Rows (tuples) map onto objects

  • Columns (attributes) map onto attributes

  • Don't write SQL

SQL is Tedious

  • Select the id and name from this table

  • Select all the details of this row

  • Select something about related tables

  • Update this row with these values

  • Insert a new record with these values

  • Delete this record

Replacing SQL

  • Instead of

      SELECT *
      FROM   my_table
      WHERE  my_id = 10
  • and then dealing with the prepare/execute/fetch code
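
  For reference, the prepare/execute/fetch code mentioned above usually looks something like this. It is a minimal sketch, assuming $dbh is an already connected DBI database handle; the function name fetch_row_by_id is made up for illustration.

```perl
use strict;
use warnings;

# $dbh is assumed to be a connected DBI database handle, for example
# one returned by DBI->connect($dsn, $user, $pass, { RaiseError => 1 }).
sub fetch_row_by_id {
    my ($dbh, $id) = @_;

    # prepare: compile the statement, with a placeholder for the id
    my $sth = $dbh->prepare('SELECT * FROM my_table WHERE my_id = ?');

    # execute: bind the value to the placeholder and run the query
    $sth->execute($id);

    # fetch: pull back the matching row as a hash reference
    my $row = $sth->fetchrow_hashref;
    $sth->finish;

    return $row;    # undef if no row matched
}
```

  Every query in the program needs some variation of these steps, which is the tedium an ORM is meant to remove.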

Replacing SQL

  • We can write

      use My::Object;
      # warning! not a real orm
      my $obj = My::Object->retrieve(10);
  • Or something similar

Writing an ORM Layer

  • Not actually that hard to do yourself

  • Each class needs an associated table

  • Each class needs a list of columns

  • Create simple SQL for basic CRUD operations

  • Don't do that
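
  To show why rolling your own looks deceptively easy, here is a rough sketch of the "table plus list of columns gives you CRUD SQL" idea. The package name My::TinyORM and the function crud_sql are hypothetical. Relationships, quoting, transactions and database differences are all ignored, which is where the real work lies.

```perl
use strict;
use warnings;

package My::TinyORM;   # hypothetical name, for illustration only

# Build the four basic statements from a table name, its primary key
# and its list of columns. Values are left as ? placeholders.
sub crud_sql {
    my ($table, $pk, @cols) = @_;

    my @non_pk = grep { $_ ne $pk } @cols;

    return (
        insert => sprintf('INSERT INTO %s (%s) VALUES (%s)',
                          $table, join(', ', @cols),
                          join(', ', ('?') x @cols)),
        select => sprintf('SELECT %s FROM %s WHERE %s = ?',
                          join(', ', @cols), $table, $pk),
        update => sprintf('UPDATE %s SET %s WHERE %s = ?',
                          $table, join(', ', map { "$_ = ?" } @non_pk), $pk),
        delete => sprintf('DELETE FROM %s WHERE %s = ?', $table, $pk),
    );
}

package main;

my %sql = My::TinyORM::crud_sql('artist', 'artistid', qw(artistid name));
print "$_: $sql{$_}\n" for sort keys %sql;
```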

Perl ORM Options

  • Plenty of choices on CPAN

  • Tangram

  • SPOPS (Simple Perl Object Persistence with Security)

  • Alzabo

  • Class::DBI

  • DBIx::Class

    • The current favourite

DBIx::Class

  • Standing on the shoulders of giants

  • Learning from problems in Class::DBI

  • More flexible

  • More powerful

DBIx::Class Example

  • Simple example of using DBIx::Class

  • Modeling a CD collection

  • Three tables

    • artist (artistid, name)

    • cd (cdid, artistid, title)

    • track (trackid, cd, title)

Main Schema

  • Define main schema class

  • MyDatabase/Main.pm

      package MyDatabase::Main;
      use base qw/DBIx::Class::Schema/;
      __PACKAGE__->load_classes(qw/Artist Cd Track/);
      1;

Define individual object classes

  • MyDatabase/Main/Artist.pm

      package MyDatabase::Main::Artist;
      use base qw/DBIx::Class/;
      __PACKAGE__->load_components(qw/PK::Auto Core/);
      __PACKAGE__->table('artist');
      __PACKAGE__->add_columns(qw/ artistid name /);
      __PACKAGE__->set_primary_key('artistid');
      __PACKAGE__->has_many('cds' => 'MyDatabase::Main::Cd');
      1;

Inserting Artists

  • Connect to database

      my $schema = MyDatabase::Main->connect($dbi_str);
  • List of artists

      my @artists = ('The Beta Band',
                     'Beth Orton');
  • Use populate

      # populate wants a row of column names, then one array ref per row
      $schema->populate('Artist',
                        [ [ 'name' ],
                          map { [ $_ ] } @artists ]);

Inserting CDs

  • Hash of Artists and CDs

      my %cds = ( 'The Three EPs' => 'The Beta Band',
                  'Trailer Park'  => 'Beth Orton');
  • Find each artist and insert CD

      foreach (keys %cds) {
        # search returns a resultset; first gives us the matching row
        my $artist = $schema->resultset('Artist')->search(
                        { name => $cds{$_} }
                     )->first;
        $schema->populate('Cd',
                          [ [ qw(title artist) ],
                            [ $_, $artist->artistid ] ] );
      }

Searching for Data

  • Get CDs by artist

      my $rs = $schema->resultset('Cd')->search(
                  { 'artist.name' => $artistname },
                  { join     => [qw/ artist /],
                    prefetch => [qw/ artist /] } );
      while (my $cd = $rs->next) {
        print $cd->title, "\n";
      }

Don't Repeat Yourself

  • There's a problem with this approach

  • Information is repeated

  • Columns and relationships defined in the database schema

  • Columns and relationships defined in class definitions

Repeated Information

  •   CREATE TABLE artist (
        artistid INTEGER PRIMARY KEY,
        name TEXT NOT NULL 
      );
  •   package MyDatabase::Main::Artist;
      use base qw/DBIx::Class/;
      __PACKAGE__->load_components(qw/PK::Auto Core/);
      __PACKAGE__->table('artist');
      __PACKAGE__->add_columns(qw/ artistid name /);
      __PACKAGE__->set_primary_key('artistid');
      __PACKAGE__->has_many('cds' => 'MyDatabase::Main::Cd');
      1;
  • This is bad

Database Metadata (An Aside)

  • Some people don't put enough metadata in their databases

  • Just tables and columns

  • No relationships. No constraints

  • You may as well make each column VARCHAR(255)

  • Describe your data in your database

  • It's what your database is for

  • It's what your database does best

No Metadata (Excuse 1)

  • "This is the only application that will ever access this database"

  • Bollocks

  • All data will be shared eventually

  • People will update your database using other applications

  • Can you guarantee that someone won't use mysql to update your database?

No Metadata (Excuse 2)

  • "Our database doesn't support those features"

  • Bollocks

  • MySQL 3.x is not a database

    • It's a set of data files with a vaguely SQL-like query syntax

  • MySQL 4.x is a lot better

  • MySQL 5.x is most of the way there

  • Don't be constrained by using inferior tools

DBIx::Class::Schema::Loader

  • Creates classes by querying your database metadata

  • No more repeated data

  • We are now DRY

  • Schema definitions in one place

  • But...

  • Performance problems

Performance Problems

  • You don't really want to generate all your class definitions each time your program is run

  • Need to generate the classes in advance

  • dump_to_dir method

Conclusions

  • ORM is a bridge between relational objects and program objects

  • Avoid writing SQL in common cases

  • DBIx::Class is the currently fashionable module

  • Lots of plugins

  • Caveat: ORM may be overkill for simple programs

More Information

  • Manual pages (on CPAN)

  • DBIx::Class

  • DBIx::Class::Manual::*

  • DBIx::Class::Schema::Loader

  • Mailing list (Google for it)

Testing

Testing

  • Never program without a safety net

  • How do you know if your code does what it is supposed to do?

  • How do you know if your code continues to do what it is supposed to do?

  • Write unit tests

  • Run those tests all the time

When To Run Tests

  • As often as possible

  • Before you add a feature

  • After you have added a feature

  • Before checking in code

  • Before releasing code

  • Constantly, automatically

Testing in Perl

  • Perl makes it easy to write test suites

  • A lot of work in this area over the last five years

  • Test::Simple and Test::More included in Perl distribution

  • Many more testing modules on CPAN

Simple Test

  use Test::More tests => 4;
  BEGIN { use_ok('My::Object'); }
  ok(my $obj = My::Object->new);
  isa_ok($obj, 'My::Object');
  $obj->set_foo('Foo');
  is($obj->get_foo, 'Foo');
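
  The slides do not show My::Object itself. A minimal hypothetical class that would satisfy the test above might look like this (assumed to be saved as lib/My/Object.pm so that use_ok can find it).

```perl
use strict;
use warnings;

package My::Object;   # hypothetical implementation, not from the slides

# Constructor: returns an empty hash-based object
sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Simple accessor pair for the 'foo' attribute
sub set_foo { my ($self, $value) = @_; $self->{foo} = $value; return $self }
sub get_foo { my ($self) = @_; return $self->{foo} }

1;
```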

Running a Simple Test

  $ prove -v foo.t 
  foo....1..4
  ok 1 - use My::Object;
  ok 2
  ok 3 - The object isa My::Object
  ok 4
  ok
  All tests successful.
  Files=1, Tests=4,  1 wallclock secs ( 0.03 cusr +  0.01 csys =  0.04 CPU)

Adding Test Names

  use Test::More tests => 4;
  BEGIN { use_ok('My::Object'); }
  ok(my $obj = My::Object->new, 'Got an object');
  isa_ok($obj, 'My::Object');
  $obj->set_foo('Foo');
  is($obj->get_foo, 'Foo', 'The foo is "Foo"');

Running a Simple Test

  $ prove -v foo2.t 
  foo2....1..4
  ok 1 - use My::Object;
  ok 2 - Got an object
  ok 3 - The object isa My::Object
  ok 4 - The foo is "Foo"
  ok
  All tests successful.
  Files=1, Tests=4,  0 wallclock secs ( 0.03 cusr +  0.00 csys =  0.03 CPU)

Using prove

  • prove is a command line tool for running tests

  • Runs given tests using Test::Harness

  • Comes with the Perl distribution

  • Command line options

    • -v verbose output

    • -r recurse

    • -s shuffle tests

    • Many more

Test Anything Protocol

  • Perl tests have been spitting out ok 1 and not ok 2 for years

  • Now this ad-hoc format has a definition and a name

  • The Test Anything Protocol (TAP)

  • See Test::Harness::TAP (documentation module) and TAP::Parser

  • More possibilities for test output

    • TAP::Harness::Color

    • Test::TAP::HTMLMatrix
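
  TAP is simple enough to produce by hand. The sketch below builds the plan line and the ok / not ok lines for a list of results. The function name tap_for is made up, and real test scripts would normally let Test::More do this for them.

```perl
use strict;
use warnings;

# Produce TAP for a list of [ $passed, $description ] pairs.
sub tap_for {
    my @results = @_;

    my @lines = ('1..' . scalar @results);     # the plan line
    my $n = 0;
    for my $r (@results) {
        my ($passed, $desc) = @$r;
        $n++;
        push @lines, ($passed ? 'ok' : 'not ok') . " $n - $desc";
    }
    return join("\n", @lines) . "\n";
}

print tap_for([ 1, 'object created' ], [ 0, 'foo is "Foo"' ]);
```

  This prints a "1..2" plan followed by "ok 1 - object created" and "not ok 2 - foo is "Foo"", which is the same shape as the prove -v output shown earlier.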

More Testing Modules

  • Dozens of testing modules on CPAN

  • Some of my favourites

  • Test::File

  • Test::Exception, Test::Warn

  • Test::Differences

  • Test::XML (includes Test::XML::XPath)

Writing Test Modules

  • These test modules all work together

  • Built using Test::Builder

  • Ensures that test modules all use the same framework

  • Use it as the basis of your own Test::* modules

  • Test your Test::Builder test modules with Test::Builder::Tester

    • Who tests the testers?
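
  As a rough illustration of building on Test::Builder, here is a tiny test module. The module name Test::Even and its is_even function are hypothetical; only Test::Builder's new, ok and diag methods are used.

```perl
use strict;
use warnings;

package Test::Even;   # hypothetical module name

use Test::Builder;

# Test::Builder->new returns a shared singleton, which is what keeps
# test numbering consistent across all the Test::* modules in a script.
my $Test = Test::Builder->new;

# Passes if $n is an even number; prints a diagnostic otherwise.
sub is_even {
    my ($n, $name) = @_;

    my $ok = $Test->ok($n % 2 == 0, $name);
    $Test->diag("    $n is not even") unless $ok;
    return $ok;
}

1;
```

  Because the counter is shared, a call such as Test::Even::is_even(42, 'answer is even') can be mixed freely with ok and is from Test::More in the same script.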

Mocking Objects

  • Sometimes it's hard to test external interfaces

  • Fake them

  • Test::MockObject pretends to be other objects

  • Gives you complete control over what they return

Mock Object Example

  • You're writing code that monitors a nuclear reactor

  • It's important that your code reacts correctly when the reactor overheats

  • You don't have a reactor in the test environment

  • Even if you did, you wouldn't want to make it overheat every time you run the tests

    • Especially if you're not 100% sure of your code

    • Or if you're running unattended smoke tests

  • Fake it with a mock object

My::Monitor

  • If the temperature of a reactor is over 100 then try to cool it down

  • If you have tried cooling a reactor down 5 times and the temperature is still over 100 then return an error

  • Create a mock reactor object that acts exactly how we want it to

  • Reactor object has two methods

    • temperature - returns the current temperature

    • cooldown - cools reactor and returns success or failure
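
  The slides do not show My::Monitor. One possible implementation of the rules above, written as a sketch, is below. The attempt counter and the constant names are assumptions. check returns true while things are under control and false once five cooldowns have not helped.

```perl
use strict;
use warnings;

package My::Monitor;   # sketch only; the real class is not shown in the slides

use constant MAX_TEMP     => 100;
use constant MAX_ATTEMPTS => 5;

sub new {
    my ($class) = @_;
    return bless { attempts => 0 }, $class;
}

# Returns true if the reactor is fine or a cooldown was attempted,
# false once MAX_ATTEMPTS cooldowns have not brought the temperature down.
sub check {
    my ($self, $reactor) = @_;

    if ($reactor->temperature() <= MAX_TEMP) {
        $self->{attempts} = 0;      # back to normal, reset the counter
        return 1;
    }

    return 0 if $self->{attempts} >= MAX_ATTEMPTS;

    $self->{attempts}++;
    return $reactor->cooldown() ? 1 : 0;
}

1;
```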

monitor.t

  use Test::More tests => 10;
  use Test::MockObject;
  BEGIN { use_ok('My::Monitor'); }
  ok(my $mon = My::Monitor->new);
  isa_ok($mon, 'My::Monitor');
  my $t = 10;
  my $reactor = Test::MockObject->new;
  $reactor->set_bound('temperature', \$t);
  $reactor->set_true('cooldown');
  ok($mon->check($reactor));
  $t = 120;
  ok($mon->check($reactor)) for 1 .. 5;
  ok(!$mon->check($reactor));

How Good Are Your Tests?

  • How much of your code is exercised by your tests?

  • Devel::Cover can help you to find out

  • Deep internal magic

  • Draws pretty charts

  •   HARNESS_PERL_SWITCHES=-MDevel::Cover make test
      cover

Devel::Cover Sample Output

Alternative Paradigms

  • Not everyone likes the Perl testing framework

  • Other frameworks are available

  • Test::Class

    • xUnit style framework

  • Test::FIT

More Info

  • Perl Testing: A Developer's Notebook (Ian Langworth & chromatic)

  • perldoc Test::Tutorial

  • perldoc Test::Simple

  • perldoc Test::More

  • perldoc Test::Builder

  • etc...

Benchmarking

Benchmarking

  • Ensure that your program is fast enough

  • But how fast is fast enough?

  • "Premature optimization is the root of all evil" - Donald Knuth (paraphrasing Tony Hoare)

  • Don't optimise until you know what to optimise

Benchmark.pm

  • Standard Perl module for benchmarking

  • Simple usage

      use Benchmark;
      my %methods = (
        method1 => sub { ... },
        method2 => sub { ... },
      );
      timethese(10_000, \%methods);
  • Times 10,000 iterations of each method

Standard Benchmark.pm Output

  Benchmark: timing 10000 iterations of method1, method2...
     method1:  6 wallclock secs \
       ( 2.12 usr +  3.47 sys =  5.59 CPU) \
       @ 1788.91/s (n=10000)
     method2:  3 wallclock secs \
       ( 0.85 usr +  1.70 sys =  2.55 CPU) \
       @ 3921.57/s (n=10000)

Benchmarking For A Given Time

  • Passing timethese a positive number runs each piece of code that many times

  • Passing timethese a negative number runs each piece of code for at least that many CPU seconds

  •   use Benchmark;
      my %methods = (
        method1 => sub { ... },
        method2 => sub { ... },
      );
      # Run for 10,000(!) seconds
      timethese(-10_000, \%methods); 

Compare Performance

  • Use cmpthese to get a tabular output

  • Optional export

  •   use Benchmark 'cmpthese';
      my %methods = (
        method1 => sub { ... },
        method2 => sub { ... },
      );
      cmpthese(10_000, \%methods);

cmpthese Output

  •                  Rate method1 method2
        method1 2831802/s      --    -61%
        method2 7208959/s    155%      -- 
  • method1 is 61% slower than method2 (read each row against the column)

  • Can also pass negative number to cmpthese
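
  Each cell in the table compares the row's rate with the column's rate. The arithmetic below reproduces the two percentages from the rates shown; the helper name pct is made up for this sketch.

```perl
use strict;
use warnings;

# Rates copied from the cmpthese table above (iterations per second)
my %rate = ( method1 => 2_831_802, method2 => 7_208_959 );

# Percentage difference of the row's rate relative to the column's rate
sub pct {
    my ($row, $col) = @_;
    return sprintf '%.0f%%', ($rate{$row} / $rate{$col} - 1) * 100;
}

print 'method1 vs method2: ', pct('method1', 'method2'), "\n";   # -61%
print 'method2 vs method1: ', pct('method2', 'method1'), "\n";   # 155%
```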

Benchmarking Is Hard

  • Very easy to produce lots of numbers

  • Ensure that the numbers are meaningful

  • Compare code fragments that do the same thing

Bad Benchmarking

  use Benchmark qw{ timethese };
  timethese( 1_000, {
    Ordinary    =>
        sub {
          my @results = sort { -M $a <=> -M $b }
                             glob "/bin/*";
        },
    Schwartzian =>
        sub {
          map $_->[0],
          sort { $a->[1] <=> $b->[1] }
          map [$_, -M], glob "/bin/*";
        },
    });
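
  One way to read what is wrong here: the two subs do not do the same work. "Ordinary" stores its result in @results and "Schwartzian" does not, and both call glob inside the timed code. A sketch of a fairer comparison follows, assuming /bin/* is still the list we care about. The helper names age, sort_ordinary and sort_schwartzian are made up for this example.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Build the file list once, outside the timed code, so that glob
# is not part of what is being measured.
my @files = glob '/bin/*';

# -M returns undef for files that cannot be stat'ed (for example
# dangling symlinks), so fall back to 0 to keep the comparison numeric.
sub age { my $age = -M $_[0]; return defined $age ? $age : 0 }

sub sort_ordinary {
    return sort { age($a) <=> age($b) } @_;
}

sub sort_schwartzian {
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] }
           map  { [ $_, age($_) ] } @_;
}

# Both subs now do the same job: sort @files and store the result.
# A negative count means "run each for at least 1 CPU second".
cmpthese(-1, {
    Ordinary    => sub { my @results = sort_ordinary(@files) },
    Schwartzian => sub { my @results = sort_schwartzian(@files) },
});
```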

What To Benchmark?

  • Profile your code

  • See which parts it is worth working on

  • Look for code that

    • Takes a long time to run, or

    • Is called many times

Devel::DProf

  • Devel::DProf is the standard Perl profiling tool

  • Included with Perl distribution

  • Uses Perl debugger hooks

  • perl -d:DProf your_program

  • Produces a data file called tmon.out

  • Command line program dprofpp to view results

Sample Output

  $ perl -d:DProf ./invoice.pl 244
  $ dprofpp 
  Total Elapsed Time = 1.882586 Seconds
    User+System Time = 0.772586 Seconds
  Exclusive Times
  %Time ExclSec CumulS #Calls sec/call Csec/c  Name
   9.06   0.070  0.070     95   0.0007 0.0007  Class::Accessor::new
   6.47   0.050  0.099      9   0.0055 0.0111  Template::Config::load
   5.18   0.040  0.040     10   0.0040 0.0040  Template::Parser::BEGIN
   5.18   0.040  0.039     40   0.0010 0.0010  Class::DBI::_fresh_init
   5.05   0.039  0.058    371   0.0001 0.0002  DateTime::Locale::_register
   3.88   0.030  0.202      4   0.0075 0.0505  DateTime::Format::MySQL::BEGIN
   3.88   0.030  0.164     19   0.0016 0.0086  DateTime::BEGIN
   3.88   0.030  0.029      6   0.0050 0.0049  Class::DBI::Loader::mysql::BEGIN
   3.75   0.029  0.117     40   0.0007 0.0029  base::import
   3.75   0.029  0.029    160   0.0002 0.0002  Lingua::EN::Inflect::_PL_noun
   2.98   0.023  0.022   1837   0.0000 0.0000  Class::DBI::Column::__ANON__
   2.59   0.020  0.124      7   0.0029 0.0178  DateTime::Locale::BEGIN
   2.59   0.020  0.040      4   0.0050 0.0099  Lingua::EN::Inflect::Number::BEGIN
   2.46   0.019  0.019    315   0.0001 0.0001  Params::Validate::_validate_pos
   2.46   0.019  0.077      1   0.0193 0.0770  DateTime::Locale::register

Conclusions

  • Don't optimise until you know you need to optimise

  • Don't optimise until you know what to optimise

  • Use profiling to find out what is worth optimising

  • Use benchmarking to compare different solutions

More Information

  • perldoc Benchmark

  • perldoc Devel::DProf

  • Chapters 5 and 6 of Mastering Perl