If you’ve ever tried building up a command programmatically, you may have run into some issues around proper quoting of arguments.
To solve most problems, you can just throw everything inside of a set of single quotes. The shell won’t touch anything inside and you’ll be all good.
$ echo "$hello world" # " world" # or something worse
$ echo '$hello world' # "$hello world"
Then the one remaining question is what to do if you want to include a single quote? Your first inclination may be to escape it with a backslash:
$ echo '$hello \' world' # incomplete
But that’s not how it works :( The way you actually have to go about this is to stop the single-quote pair, escape a single quote, and then start it back up. Here it is:
$ echo '$hello '\'' world' # "$hello ' world"
—
As a bonus, if you try to do this substitution in Ruby, you may try something like:
var = 'hello \' world'
var.gsub('\'', '\'\\\'\'') # "hello ' world' world"
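Wat!? That’s not what you wanted! In a gsub replacement string, \' is a back-reference to the text after the match, so it gets expanded instead of inserted literally. Read that documentation! Let’s try again: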
var.gsub('\'') { '\'\\\'\'' } # "hello '\\'' world"
There we go! Make it a little prettier even:
var.gsub("'") { %q{'\''} }
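Putting the two halves together, here’s a minimal sketch of a quoting helper. The name `shell_single_quote` is made up for illustration, and the round trip through `Shellwords.split` from Ruby’s standard library is only there as a sanity check that a shell-style parser reads the result back as a single argument.

```ruby
require 'shellwords'

# Hypothetical helper (not from the post): wrap a string in single quotes,
# swapping each embedded ' for '\'' (close the quote, escaped quote, reopen).
def shell_single_quote(str)
  "'" + str.gsub("'") { %q{'\''} } + "'"
end

quoted = shell_single_quote("$hello ' world")
puts quoted                # '$hello '\'' world'
p Shellwords.split(quoted) # ["$hello ' world"]
```

If you don’t specifically need the single-quote form, the standard library’s `Shellwords.escape` solves the same problem with backslashes instead.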
When you’re writing tests in Node, it’s often useful to be able to stub out modules which are included by the object you’re attempting to test. Unfortunately, due to the way that the module system is structured, it’s not straightforward to do this in a single test (as opposed to globally).
There are a few solutions for this dependency injection problem around, namely node-sandboxed-module and injectr. These modules use node’s vm module to create a new context that they run the test inside of, and in that context they use the mock instead of the original when calling `require`. This is a nice solution, but unfortunately by bringing the code into a new context, you also break code analysis tools like mocha’s html-cov.
A few days ago, Matt Morgan released a module called mockit, which approaches the problem instead by temporarily replacing `require` when requiring the dependency the test needs (as opposed to trying to replace it for the duration of the test).
Imagine you had a `Downloader` class, and you wanted to use it in your tests, but have it so that when `Downloader` called `require('http')`, it got a mock instead of node’s http module. You could do that with one call to mockit:
var Downloader = mockit('../lib/downloader', {
  http: mockHttp
});
I think this is a really nice solution to this problem – totally unobtrusive and uses node’s existing `require` functionality when no mock exists. Check it out!
Today I’m going to share a cool method for generating a JavaScript callstack from any point in your code. JavaScript doesn’t offer this functionality, but we can tease it out pretty easily, especially in V8 (very useful for Node programmers).
The closest thing JavaScript has to a callstack is the 10-frame backtrace that comes along whenever an error is generated. So we can use that to our advantage by generating an error on demand, catching it, and examining the `stack` property of the thrown error object:
function callstack() {
  // `capture` is never declared, so touching it throws on demand
  try { capture.error } catch (e) {
    return e.stack;
  }
}
Normally we’d have to stop there, and if we wanted to access that data programmatically we’d have to parse the preformatted String that comes back. But with V8 we have a better option! V8 gives us access to the method that prepares the preformatted response, so by temporarily replacing it we can return the original data instead. Check it out, here’s a small node.js module:
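// Hand back the raw array of CallSite objects instead of a formatted string
var replacementPreparer = function (error, trace) {
  return trace;
};
module.exports = function () {
  var capture;
  var oldPreparer = Error.prepareStackTrace;
  Error.prepareStackTrace = replacementPreparer;
  // `capture` is undefined here, so this throws and gives us an error to inspect
  try { capture.error } catch (e) {
    capture = e.stack;
  }
  // Put the original preparer back so other errors format normally
  Error.prepareStackTrace = oldPreparer;
  return capture;
};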
And the usage:
var callstack = require('./path/to/callstack/js');
for (var i = 0; i < 1000; i++) {
  callstack(); // use this anywhere in your code
}
When you call `callstack`, you’ll get back an array of CallSite objects upon which you can call methods from the V8 API to get at all the juicy details: http://code.google.com/p/v8/wiki/JavaScriptStackTraceApi
I’ve touched on this gotcha briefly in the past when discussing the Wat video, but I thought a few examples of when single-line conditionals can bite you would be fun.
In Ruby, we can write a conditional containing a single expression that normally takes up three lines:
unless condition
  something
end
on a single line to save space:
something unless condition
And that is all well and good – these two are pretty much the same. But they’re not identical in practice. There are a few weird things about to come up.
Our first example will be using defined? to conditionally print a variable:
if defined?(var1)
  puts var1 # never runs
end
var1 # NameError: undefined local variable
So that works just as we’d expect. The conditional never runs because defined?(var1) returns nil. After the conditional, accessing the undefined var1 gives us a NameError because (appropriately) it’s not defined. Let’s modify that a little bit and put an assignment inside of the conditional.
if defined?(var2)
  var2 = 5 # never runs
end
var2 # nil
defined?(var2) # "local-variable"
So that might look a bit odd. We never ran the conditional, so var2 never gets set, which makes sense. But after the block, var2 doesn’t throw a NameError when accessed anymore. This is because the Ruby parser makes room for var2 and defines it when it sees it on the lefthand side of an assignment (even though it’s inside of a conditional that doesn’t run).
Let’s write the same on one line though:
var3 = 5 if defined?(var3)
var3 # 5
Even more interesting: an undefined variable written this way will become defined and assigned when run. The first thing that happens is that the parser comes along and defines var3, which it sees on the lefthand side of an assignment. Then defined? runs, which this time evaluates to "local-variable", causing the conditional to pass, and 5 to be assigned to var3. In cases like this, the single-line conditional will produce an entirely different result than the block conditional.
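To see the two forms side by side without polluting the top-level scope, here’s a small sketch that wraps each one in its own method. The method names and the variable `value` are just placeholders for this example.

```ruby
# Block form: the parser hasn't seen `value =` yet when it reads defined?(value),
# so the check fails and the assignment never runs.
def block_form
  if defined?(value)
    value = 5 # never runs
  end
  value
end

# Modifier form: the parser reads `value = 5` first, so `value` is already
# a local variable by the time defined?(value) is evaluated.
def modifier_form
  value = 5 if defined?(value)
  value
end

p block_form    # nil
p modifier_form # 5
```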
Yesterday I had lunch with Paul Dix, and he mentioned a serialization library called MessagePack that could serialize and deserialize the same data as JSON, but with a much smaller footprint thanks to the way that it encodes its data. So I started digging in.
JSON stores data in sets of enclosing braces that are extremely readable, but not really optimal for a computer to read. MessagePack gets its optimizations by putting all of the data needed to know the size of a data structure at the front of it.
So JSON thinks like:
[      # square brace - i'm about to read an array
"one", # read until the end ", then saw a comma - there must be more elements
"two"  # read until the end ", no comma
]      # square brace - i'm done reading an array
And MessagePack thinks like:
\x92 # an array is coming with 2 elements
\xA3 # the first element is a string with 3 letters
one  # 3-letter string
\xA3 # the second element is a string with 3 letters
two  # 3-letter string
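To make those bytes concrete, here’s a rough sketch that hand-encodes only this one shape of data (a short array of short strings) using nothing but Ruby’s standard library. It is not the MessagePack library’s API, and the helper name `pack_short_strings` is invented for this example; it only covers the spec’s small “fixarray” (0x90 plus the element count) and “fixstr” (0xA0 plus the byte length) markers.

```ruby
require 'json'

# Hypothetical encoder for arrays of up to 15 strings of up to 31 bytes each.
def pack_short_strings(strings)
  raise ArgumentError, 'too many elements for a fixarray' if strings.size > 15
  out = [0x90 | strings.size].pack('C') # fixarray marker with the count mixed in
  strings.each do |s|
    raise ArgumentError, 'string too long for a fixstr' if s.bytesize > 31
    out << [0xA0 | s.bytesize].pack('C') # fixstr marker with the length mixed in
    out << s.b                           # the raw bytes of the string
  end
  out
end

packed = pack_short_strings(%w[one two])
puts packed.bytesize                       # 9
puts JSON.generate(%w[one two]).bytesize   # 13
```

The size is known before any of the content is read, which is exactly why the decoder never has to scan for a closing quote or bracket.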
So that’s really cool, and there are implementations for a lot of languages. But I couldn’t stop. How much smaller does MessagePack make things practically, and how much faster is it?
I figured I’d compare a few common structures with the two libraries and see how things came out. I was also very interested to see how each library’s output compressed with gzip.
So I started off with a Facebook profile. It’s a large Dictionary/Array mix of data with generous amounts of integers and strings mixed in. A great test subject for a common case.
== Full facebook profile ====
MessagePack length: 1990* (973 compressed - 49%)
JSON length: 2441 (955* compressed - 39%)
So on *this* data, although MessagePack is smaller, it didn’t compress by the same amount and thus the JSON compressed version was smaller. This is extremely interesting, but not entirely surprising if you think about the amount of information that’s repeated over and over in JSON (specifically common patterns like “},{”, which MessagePack doesn’t have). Weigh that against the fact that MessagePack has a large number of identifiers that mean “Array”, because the identifier mixes in information about the length of the structure.
A friends list from Facebook would be an interesting subject too, since it’s a large array of 2-element arrays.
== Facebook friends list ====
MessagePack length: 21848 (9482 compressed - 43%)
JSON length: 27653 (9149 compressed - 33%)
As predicted, MessagePack does a better job compressing data like this, because its identifier for Array also contains the length of the array about to come. JSON is still smaller compressed here, but let’s push this further. Let’s throw a structure at each that is a 100-element array of arrays, each holding 100 7-letter words.
== 100 groups of 100 7-letter words ====
MessagePack length: 80303 (49152 compressed - 61%)
JSON length: 100201 (51726 compressed - 52%)
MessagePack compressed just as we thought, and now has edged out JSON. Let’s do the same with arrays of 7-digit numbers (just for fun).
== 100 groups of 100 7-digit numbers ====
MessagePack length: 50303 (36585 compressed - 73%)
JSON length: 80201 (36898 compressed - 46%)
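The 7-letter-word test can be approximated with just the standard library. This is only a sketch, not the code behind the numbers above: the MessagePack side is hand-encoded with two made-up helpers (`array16` and `fixstr`, named after the spec’s formats) rather than the real library, and the words are random lowercase letters from a fixed seed, which is my assumption about the test data. The raw lengths come out at 80303 and 100201 bytes; the compressed sizes will depend on the words generated.

```ruby
require 'json'
require 'zlib'

# array16: marker byte 0xDC, a 2-byte big-endian count, then the encoded items.
def array16(encoded_items)
  [0xDC, encoded_items.size].pack('Cn') + encoded_items.join
end

# fixstr: marker byte 0xA0 with the length mixed in (up to 31 bytes), then the bytes.
def fixstr(str)
  [0xA0 | str.bytesize].pack('C') + str.b
end

rng = Random.new(42) # fixed seed so runs are repeatable
letters = ('a'..'z').to_a
groups = Array.new(100) do
  Array.new(100) { Array.new(7) { letters.sample(random: rng) }.join }
end

msgpack = array16(groups.map { |words| array16(words.map { |w| fixstr(w) }) })
json = JSON.generate(groups)

{ 'MessagePack' => msgpack, 'JSON' => json }.each do |name, raw|
  gz = Zlib.gzip(raw).bytesize
  percent = (100.0 * gz / raw.bytesize).round
  puts format('%s length: %d (%d compressed - %d%%)', name, raw.bytesize, gz, percent)
end
```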
For me, speed is where I’m really excited about what MessagePack could be capable of. So I started comparing the performance of the Ruby MessagePack library (written as a C extension) to the Ruby YAJL library (also C bindings, to YAJL).
On the same Facebook profile as above, I benchmarked encoding and decoding over 100,000 runs each.
== Encoding ====
          user     system      total        real
json  6.360000   0.120000   6.480000 (  6.468930)
msg   4.480000   0.010000   4.490000 (  4.481516)
== Decoding ====
          user     system      total        real
json 15.040000   0.270000  15.310000 ( 15.290871)
msg  13.180000   1.530000  14.710000 ( 37.719052)
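If you want to try a comparison like this yourself, here’s a rough harness. It is not the script behind the numbers above (that one is linked at the end of the post): it swaps in Ruby’s bundled `json` library instead of YAJL, only times MessagePack if the msgpack gem happens to be installed, and the `time_codecs` name and sample data are placeholders.

```ruby
require 'benchmark'
require 'json'

begin
  require 'msgpack' # optional: the msgpack gem, if it is installed
rescue LoadError
  # no gem available; only JSON will be timed
end

# Returns wall-clock seconds spent encoding and decoding `data` `runs` times.
def time_codecs(data, runs)
  codecs = { 'json' => [->(d) { JSON.generate(d) }, ->(s) { JSON.parse(s) }] }
  if defined?(MessagePack)
    codecs['msg'] = [->(d) { MessagePack.pack(d) }, ->(s) { MessagePack.unpack(s) }]
  end

  codecs.each_with_object({}) do |(name, (encode, decode)), results|
    encoded = encode.call(data)
    results[name] = {
      encode: Benchmark.realtime { runs.times { encode.call(data) } },
      decode: Benchmark.realtime { runs.times { decode.call(encoded) } }
    }
  end
end

sample = { 'name' => 'example', 'ids' => [1, 2, 3], 'tags' => %w[one two] }
p time_codecs(sample, 1_000)
```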
I will absolutely be using MessagePack when storing data either uncompressed, or with large amounts of repeated structure sizes. It obviously won’t work anywhere you’d like the data to be human-readable, but it’s an amazingly brilliant idea with great execution.
Check it out!
The code used to generate these numbers: https://gist.github.com/3188573