Comparing C++ And Perl

Warning: this information reflects my first attempt at using Connected Text (a personal wiki system) to capture useful information, and post it to the web. The information here is not particularly polished at present.

The problem

We have a file that has many records that look like the following:

@@authors

C. Gentry

@@title

Key Recovery and Message Attack on NTRU-Composite

@@title_link=http://www.iacr.org/archive/eurocrypt2001/20450181.pdf

@@conference=EUROCRYPT @@year=2001 @@pages=182-194

@@publisher=Springer-Verlag @@series=LNCS @@vol=2045

@@location=Innsbruck, Austria

@@technology=ntruencrypt

@@description

The paper presents a clever attack that would work against NTRUEncrypt if
NTRUEncrypt were ever deployed with the security parameter N not a prime
number. In practice, NTRU Cryptosystems has strongly recommended that N
always be prime. Indeed, all commercial implementations conform to NTRU's
recommendations, and are immune to this attack. 

@@abstract

NTRU is a fast public key cryptosystem presented in 1996 by Hoffstein,
Pipher and Silverman of Brown University. It operates in the ring of
polynomials $Z[X]/(X^N-1)$, where the domain parameter $N$ largely
determines the security of the system. Although $N$ is typically
chosen to be prime, Silverman proposes taking $N$ to be a power of two
to enable the use of Fast Fourier Transforms. We break this scheme for
the specified parameters by reducing lattices of manageably small
dimension to recover partial information about the private key. We
then use this partial information to recover partial information about
the message or to recover the private key in its entirety.

@@-----------------------------------------------------------------------

We want to parse this file and transform it into HTML (or some other output format).

The Code

The general idea is to split the data on each @@. An = sign directly after the tag name is optional and is ignored (it just helps readability when multiple @@'s occur on the same line). When we see a separator of the form @@-------, we finish the current bibliography item and go on to a new one.
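The splitting idea above can be sketched in a few lines of C++. This is only an illustration under my own assumptions, not the code from the original implementation: the names Record, trim and parse_records are hypothetical, and each item is simply stored as a map from tag name to text.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// One bibliography item: tag name -> text (hypothetical representation).
using Record = std::map<std::string, std::string>;

// Remove leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Split the input on "@@" and group the resulting fields into records.
std::vector<Record> parse_records(const std::string& text) {
    std::vector<Record> records;
    Record current;
    std::size_t pos = text.find("@@");
    while (pos != std::string::npos) {
        const std::size_t start = pos + 2;
        const std::size_t next = text.find("@@", start);
        const std::string chunk = text.substr(
            start, next == std::string::npos ? std::string::npos : next - start);

        if (chunk.compare(0, 3, "---") == 0) {
            // "@@-------" separator: close the current item, start a new one.
            if (!current.empty()) records.push_back(current);
            current.clear();
        } else {
            // The tag name is the leading run of letters, digits, '_' or '-'.
            std::size_t i = 0;
            while (i < chunk.size() &&
                   (std::isalnum(static_cast<unsigned char>(chunk[i])) ||
                    chunk[i] == '_' || chunk[i] == '-')) {
                ++i;
            }
            const std::string tag = chunk.substr(0, i);
            // Skip whitespace, then one optional '=' sign.
            while (i < chunk.size() &&
                   std::isspace(static_cast<unsigned char>(chunk[i]))) {
                ++i;
            }
            if (i < chunk.size() && chunk[i] == '=') ++i;
            if (!tag.empty()) current[tag] += trim(chunk.substr(i));
        }
        pos = next;
    }
    if (!current.empty()) records.push_back(current);
    return records;
}
```

Calling parse_records on the contents of the file should give one Record per bibliography item, which can then be written out as HTML or any other format.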

A Solution in Perl

The C++ Architecture

Perl Features

Perl Feature         Example
Regular Expressions  if ($i =~ /^([\w-]+)\s*=?\s*/)
Sorting              sort { my_sort($a, $b); } @entries
Hashes               my %headings = ();
Arrays               my @technologies = ( "ntruencrypt", ... );
References           append($', $current_tag, \%current);
Dereferences         ${$current_ref}{$current_tag} .= $i;
String functions     print "-"x40;

C++ Features

The C++ features we used (to some degree) in our implementation were:

Text File Conventions

C++ implementations follow certain conventions, for example in the layout of code (I indent K&R style, 4 spaces per level, using spaces rather than real tabs).

The use of #ifndef include guards inside header files (e.g. Bibitem.h) is also a convention. By convention each header file should #include only the header files necessary for its compilation, and such inclusions can be reduced by using forward class declarations. Similarly, cpp files should #include only the header files necessary for their compilation (forward class declarations are not much use here, since a cpp file generally needs the full class definitions).
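As a rough illustration of these conventions, here is how an include guard and a forward declaration might look. The file names Bibitem.h and Parser.h come from the project, but the class contents (title, read_from, next_title) are made up for this sketch, and the three "files" are shown in one listing only so that it compiles as a single unit.

```cpp
// ---- Bibitem.h (sketch) ----
#ifndef BIBITEM_H
#define BIBITEM_H

#include <string>

class Parser;  // forward declaration: enough for references and pointers

class Bibitem {
public:
    explicit Bibitem(const std::string& title) : title_(title) {}
    const std::string& title() const { return title_; }
    // Only the name Parser is needed here, so Parser.h is not included.
    void read_from(Parser& parser);
private:
    std::string title_;
};

#endif  // BIBITEM_H

// ---- Parser.h (sketch, hypothetical contents) ----
#ifndef PARSER_H
#define PARSER_H

#include <string>

class Parser {
public:
    // Stand-in for real parsing: always returns the same title.
    std::string next_title() const {
        return "Key Recovery and Message Attack on NTRU-Composite";
    }
};

#endif  // PARSER_H

// ---- Bibitem.cpp (sketch) ----
// In a real project this file would #include "Bibitem.h" and "Parser.h";
// the full definition of Parser is needed only here, where it is used.
void Bibitem::read_from(Parser& parser) {
    title_ = parser.next_title();
}
```

The point of the forward declaration is that any file including Bibitem.h does not also pull in Parser.h, which keeps compile-time dependencies down.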

Namespaces

Namespaces should be declared in header files, but header files should never contain using directives, i.e. the header files should be explicit about which classes they are using. For example, a namespace is declared in Parser.h, the std namespace is used in parse.cpp, and it is not used in Bibitem.h, where the STL string class is explicitly written as std::string.
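A small sketch of the namespace convention follows. The namespace name bib and the member function split_words are hypothetical, chosen only for this example; the two files are again shown in one listing.

```cpp
// ---- Parser.h (sketch): declares a namespace, spells out std:: ----
#include <string>
#include <vector>

namespace bib {  // hypothetical namespace name

class Parser {
public:
    // No using directive in the header: std:: is written explicitly.
    std::vector<std::string> split_words(const std::string& line) const;
};

}  // namespace bib

// ---- parse.cpp (sketch): a using directive is acceptable in a cpp file ----
#include <sstream>
using namespace std;

vector<string> bib::Parser::split_words(const string& line) const {
    istringstream in(line);
    vector<string> words;
    string word;
    while (in >> word) {  // operator>> splits on whitespace
        words.push_back(word);
    }
    return words;
}
```

Keeping the using directive out of the header means that files including Parser.h do not have the whole std namespace silently opened for them.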

Boost

Boost has many useful classes. In this project it was used for its handling of regular expressions.
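To give a flavour of the regular expression handling, here is a sketch that mirrors the Perl pattern /^([\w-]+)\s*=?\s*/ from the table above. Note the assumption: it uses std::regex from the standard library rather than Boost, so that it compiles without extra libraries. Boost.Regex offers a very similar interface (boost::regex, boost::smatch, boost::regex_search). The Field struct and split_field function are hypothetical names.

```cpp
#include <regex>
#include <string>

// Result of matching one "@@"-delimited chunk (hypothetical helper type).
struct Field {
    bool ok;           // true if the chunk starts with a tag name
    std::string tag;   // the captured tag, e.g. "year"
    std::string rest;  // everything after the tag and optional '='
};

Field split_field(const std::string& chunk) {
    // Same pattern as the Perl version; the dash is escaped to be safe.
    static const std::regex tag_re(R"(^([\w\-]+)\s*=?\s*)");
    std::smatch m;
    if (!std::regex_search(chunk, m, tag_re)) {
        return {false, "", ""};
    }
    // m[1] plays the role of Perl's $1, and m.suffix() the role of $'.
    return {true, m[1].str(), m.suffix().str()};
}
```

For example, split_field("year=2001") should yield the tag "year" and the remaining text "2001".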

An example of

An example of

An example of

Error Handling

#defines

  headers
  cpp

Inheritance

  derived

Keywords

  static
  const

STL

  vector
  map
  ostream
  sort
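The four STL pieces listed above fit together roughly as in the following sketch. It is not the original implementation: the Record type, the helper names and the ordering rule (newest year first, then by title) are assumptions made for illustration, playing the role that my_sort plays in the Perl version.

```cpp
#include <algorithm>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// One bibliography item: tag name -> text (hypothetical representation).
using Record = std::map<std::string, std::string>;

// Look up a tag, returning "" when it is absent (find does not insert).
static std::string field(const Record& r, const std::string& tag) {
    const Record::const_iterator it = r.find(tag);
    return it == r.end() ? std::string() : it->second;
}

// Hypothetical ordering: newest year first, then title alphabetically.
bool newer_first(const Record& a, const Record& b) {
    const std::string ya = field(a, "year");
    const std::string yb = field(b, "year");
    if (ya != yb) return ya > yb;  // four-digit years compare fine as strings
    return field(a, "title") < field(b, "title");
}

// std::sort with a comparison function, like Perl's sort { ... } @entries.
void sort_records(std::vector<Record>& records) {
    std::sort(records.begin(), records.end(), newer_first);
}

// Write one record to any ostream (std::cout, a file, a string stream).
void print_record(std::ostream& out, const Record& r) {
    for (Record::const_iterator it = r.begin(); it != r.end(); ++it) {
        out << it->first << ": " << it->second << '\n';
    }
}

// Convenience wrapper that formats a record into a string.
std::string format_record(const Record& r) {
    std::ostringstream out;
    print_record(out, r);
    return out.str();
}
```

Because print_record takes a std::ostream&, the same function can write to the console, to a file, or into a string for testing.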

A Comparison of Perl and C++

Since we went to the bother of writing two implementations of essentially the same idea, it is worth taking a small amount of time to compare the two. Useful metrics might be:

Development time

I would have to say the C++ took about twice as long to develop as the Perl, even with a working Perl program available first (the separate files and the more complicated build process just took time to get right).

Running time

C++ wins on running time, but actual running time is very small in both cases (since the number of bibliography items is so small).

Debugging time

The C++ code is more robust, so it is far easier to maintain and debug.

Scaling

I would much rather extend the C++ code than the Perl code. However, again, it will be somewhat slower to develop.

Conclusion

Perl is a great prototyping language, and for small one-off tasks (e.g. filtering) it is almost certainly better than C++. If the project is expected to expand, then C++ will become preferable. Obviously you only want to work with one version going forward, since maintaining two independent pieces of code is a pain.


Wednesday, April 29, 2009 (23)