Friday, December 11, 2009

Journeys through PHP

Over the years I've used a few server-side languages: C, Perl, C Shell; I even wrote my own web server for kicks once. Sounds very 1997, right?

Back to 2009...

So I had to make this website for this company, a big idea company, from scratch. I was a systems programmer who wanted to become familiar with web stuff.

The feasible options in my subjective mind were:

Java

Java was a pain to use in college, and I never really understood Tomcat. Also, I just personally feel like Java requires more thoughtful design and more work. I hate doing both those things.

Scala

Too young still. No really, I don't believe it has proven itself yet. I'm sure it's full of crazy bugs. Would you consider Twitter stable? It uses Scala.

Perl

Too old. No I mean ... this would have been ok, I guess. Slashdot is probably still in Perl ... as is, I dunno. It's a good language - probably wouldn't have been a bad choice ... oh well. So much for that.

Ruby

I don't know what is up with this language. People love it one day, then hate it the next. I don't think I've heard of anyone at, like, the big 50 sites talking about it. So until that happens, until I see an article entitled something like
"How we reduced latency and increased load on eBay general search with a distributed Ruby-based approach",
until that day, it's really not worth looking into.

Python

Oh, a language where print had to be redesigned to be more 'pythonic'. These people actually get bent out of shape over some imaginary philosophy that constitutes "Good Programming Language Design". They are like the Libertarians of the programming world. Maybe there is some manifesto nobody has shown me or some magazine I'm not subscribed to.

It's a fine language, but when basic things require the runaround (like a switch statement), I'm told, "But this, this is 'proper'," by language purists referencing some invisible playbook from which all their assertions come.
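For the record, the runaround for a switch usually looks something like the sketch below: a dictionary of functions standing in for the case labels. The handler names and the HTTP-method example are made up purely for illustration.

```python
def handle_get(path):
    return "GET " + path

def handle_post(path):
    return "POST " + path

def handle_unknown(path):
    return "405 Method Not Allowed"

# The dictionary plays the role of the switch's case labels.
HANDLERS = {
    "GET": handle_get,
    "POST": handle_post,
}

def dispatch(method, path):
    # dict.get with a fallback plays the role of the 'default:' branch.
    handler = HANDLERS.get(method, handle_unknown)
    return handler(path)

print(dispatch("GET", "/index"))
print(dispatch("DELETE", "/index"))
```

It works, and it's arguably tidy, but it is more ceremony than a plain switch.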

I have to give it some credit, however: when you want to do non-trivial, advanced things, Python handles them almost as well as Perl. Only I guess using Python instead of Perl will make me more friends amongst the wine-sipping, dining philosophers of the programming world.

PHP

Ah, the 'bottom rung', 'used dishrag', 'tragically maldesigned' accident of computer science. Why would anyone ever engage themselves in such a huge waste of time?

Well,
  • It's easy
  • It's usually fast
  • It's relatively bug free
  • It's philosophy-free
  • It can do almost anything that takes less than a page of math to explain
  • It integrates nicely with webpages
That sounds like a decent language. But for some reason, it's totally uncool.



Maybe because it's such an easy choice, the large amount of bad code from amateur programmers in PHP has given the language itself a bad reputation --- you know, like what happened with Java. Sorry Sun. :-(

Some of the distaste for PHP, however, is well-founded.

The Bad Parts of PHP

Ok, so PHP is a fine language. There. I said it. Go unfriend me on Facebook if you want. But really, it grabs things from a database and can easily emit HTML. That usually covers it.

The truth is, most of the time, if you are doing webpages, you can get by with any language that supports the following:
  • Conditionals (if/then/else)
  • Variables that vary (haha - take that Haskell)
  • Data structures (array)
  • Data resource connections (like pg_query)
  • Easy to use I/O routines (sorry Erlang)
  • Iterators (for, foreach, while, do)
And that's it. You can totally make a huge ass website if you just have those things in your language. This isn't asking much, and when you don't ask for much, PHP delivers!
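To show how little that list really asks for, here is a minimal sketch that ticks every box. It is written in Python rather than PHP, and sqlite3's in-memory database stands in for something like pg_query; the table, the rows, and the render_page name are all invented for the example.

```python
import sqlite3
from html import escape

# Data resource connection (an in-memory sqlite3 database as a stand-in).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (title TEXT, published INTEGER)")
db.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("Journeys through PHP", 1), ("Draft <wip>", 0)],
)

def render_page(db):
    # A variable that varies, holding a data structure (a list).
    lines = ["<ul>"]
    # Iterator over the query result.
    for title, published in db.execute("SELECT title, published FROM posts"):
        # Conditional.
        if published:
            lines.append("<li>%s</li>" % escape(title))
        else:
            lines.append("<li><em>%s (draft)</em></li>" % escape(title))
    lines.append("</ul>")
    return "\n".join(lines)

# Easy I/O: just print the page.
print(render_page(db))
```

That is a whole database-backed page in about twenty lines, using nothing beyond the six items above.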



Alas, there are some problems:

Concurrency Support

Concurrency isn't something that can be an afterthought. You can't just slap it onto a language like you can with SHA1 support. Well, you can, but then you get a mess. Ideally, you need to have a language that was carefully thought out (read: Python) or something designed for this mess (read: Erlang).

Overhead

Object memory overhead in PHP is obscene. Just ridiculous. Most web objects are small, like on the order of kilobytes --- which PHP happily translates into megabytes. But if you want to do some Data Warehousing applications and you have a huge framework already in PHP, then PHP for the DW sounds like a reasonable choice. Oh, my friend, be prepared for a surprise.
  • PHP's hash tables don't scale well. Amortized O(1) my ass. Yeah, if 1 means 1 second.
  • PHP's memory limit is a 32-bit signed int that wraps around and does some ABS or something on itself, so if you make it like 2049M then you have a 1MB limit. You can try it yourself if you like.
  • Object oriented-ness was tacked on ... once again, if it's going to be there, then it has to be a core language design feature, not an afterthought. When you try to retrofit a traditional language into an object oriented syntax, you usually end up with something much more complex than need be, i.e. C++.
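To put numbers on the memory-limit bullet above, here is the arithmetic as a Python sketch. The masking step is an assumption about what "wraps around" amounts to, chosen because it reproduces the 1MB figure; it is not taken from PHP's source.

```python
MB = 1024 * 1024
INT32_MAX = 2**31 - 1            # largest value a 32-bit signed int can hold

requested = 2049 * MB            # what "2049M" asks for, in bytes

# Assumption: the overflow behaves like keeping only the low 31 bits.
# This is a guess that matches the 1MB result, not verified behavior.
effective = requested & INT32_MAX

# For contrast: a plain two's-complement wrap followed by abs()
# would leave a very different number.
wrapped = requested - 2**32
abs_variant = abs(wrapped)

print(requested > INT32_MAX)     # does the request overflow?
print(effective // MB)           # megabytes left under the masking assumption
print(abs_variant // MB)         # megabytes left under wrap-then-abs
```

Under the masking assumption, 2049M lands on exactly one megabyte, which matches the behavior described; the wrap-then-abs variant does not.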
As a result, doing anything more than C-style software engineering in PHP is just going to lead to bad news. In PHP, you need to embrace globals, carefully name your variables, and sort your code like it's a C program from back in the day. Then when you hit the scaling point of multiple databases spanning across multiple servers you realize a problem:

No serious backend software is in PHP

Why does this matter? Well let's look at the DW again. If you are using a Java based system like say, Solr (Lucene), HBase (Hadoop), or Cassandra, then you have two options:
  1. Use a restful API, or in the case of Cassandra, Thrift.
  2. Program in Java
The problem with number 1 is that, unless you have scaled your application over sooo many systems, so poorly, that your data is far away from your processing, you are duplicating data in memory. Sure it's fast, but it's also stupid. What you are saying is this:
"As a matter of convenience and stubbornness and my decision to program in language X when using a library in language Y, I will now translate all the data from language Y to X through some crazy interface before I do my mapping and reducing steps."
And that, my dear, is horribly irresponsible.
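A rough sketch of where the duplication comes from, with Python's json module standing in for the REST or Thrift hop. Everything here is hypothetical: the row shape and the variable names are invented, and in real life the two sides would live in different processes.

```python
import json

# Copy 1: hypothetical rows as they might sit inside the Java-side store.
rows_in_store = [{"id": i, "term": "term-%d" % i} for i in range(1000)]

# Copy 2: the serialized payload that crosses the interface.
payload = json.dumps(rows_in_store)

# Copy 3: the same rows rebuilt as native objects on the client side.
rows_in_client = json.loads(payload)

print(rows_in_client == rows_in_store)   # same data?
print(rows_in_client is rows_in_store)   # same object in memory?
print(len(payload))                      # characters spent on the wire format
```

Same data, three representations, and that is before any mapping or reducing has happened.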

Despite all this, I'm guilty of all the above and use PHP extensively. :-p
