Using Spammer Techniques for Good in your Investor Pitch

Why are spam emails horribly misspelled and in general terribly written? Surely they want to be taken seriously by their readers. It's not an accident, and it's not that they can't write any better. They are consciously filtering for people who are good targets for the next steps of their scams. If you're turned off by bad grammar and writing, you'd definitely be turned off by what will come next anyway, but you will have wasted some of their time in the process. So better cut you off at the top of the funnel. Microsoft Research actually wrote a great paper on the subject.

How does this apply to an investor pitch, you ask? As an entrepreneur pitching a venture, you are often pressured to present your startup in the broadest, most attractive, most polished terms. This can be of use when it reflects reality on the ground, but the counterintuitive advice we gain from the spammers is that flaws, or even just quirky/idiosyncratic elements, are worth exposing in a pitch and deck. If the investors would care about them, they will find out and drop out anyway. If they continue talking to you, you have a much more qualified pipeline that understands the implications, and discussions are much more likely to come to fruition.

But you also save yourself from a more insidious failure mode. If you are unclear about your roadmap, your philosophy, or your plans to develop the business, and encourage an investor to believe what they want to believe, you have just signed up for a lot of pain at the first sign of trouble or earlier. If the investor believes that the technology is "almost there" or that users are signing up in droves when they're not, then you may get their money, but perhaps you'd rather take the money of someone who believes in the vision. If they signed up for a social enterprise and you intend to maximise for profit, there are bound to be hurt feelings and mistrust, justifiably so. This is the counterintuitive advice against over-optimising your deck and pitch.

This actually applies to any "sale" whatsoever, from hiring, to partnerships, to finding a romantic partner. Going too far in the other person's direction during the sale is likely to cost you farther down the line. Better make your true self apparent early on and save yourself a bad match. You may spend a little more time searching, but that search will pay off much more handsomely.

So there you have it. The importance of being honest, in game theoretic terms, as demonstrated by spammers.

Bridgebuilding through the lens of Lean Startup

Bridges are the most un-lean kind of thing to build. You work for months and months (or even years), and at any given point in the process, have about 0 throughput.

Mayor: "You've spent quite a few millions, and have been working for 8 months now. I see quite a bit of scaffolding and materials are in place. How many people crossed the bridge today Mr. Architect?"

Architect: "Well, zero. You see the bridge isn't done yet. When it's done we expect thousands of people to cross every day"

Mayor: "Enough with the waterfall thinking! You should be building incrementally! You're not even measuring any KPIs are you? You've made no progress since we funded the project, and you expect us to keep paying the bills?"

If funding decisions for bridges went this way, we wouldn't have too many of them. Thankfully, a bridge is a well understood concept, and we can work around the skepticism by pointing to other examples.

But what about the first bridge? It would have to be funded and built by people that understood the principles of what was being built. Even when parts of it collapsed during construction (part of building the first of its kind is that you don't know how), they would have to persevere, review their assumptions, and press on.

This exact situation, but worse, actually happened with the Hagia Sophia in Constantinople. The third (and current) version of the building was started in 532 CE and completed in 537, after almost 6 years of work. Earthquakes in 553 and 557 made its dome collapse, due to over-ambitious design. The nephew of the original architect had to come back to fix the dome, increasing its curvature, adding more supports, and using lighter materials.

Since its completion, for 800 years, the Hagia Sophia was the largest enclosed building in the world. The Statue of Liberty can fit beneath its dome with room to spare.

Imagine seeing this vast project, 25 years after the start of its construction, with a collapsed dome, and trying to measure its success. If the decision to continue or not was left to someone thinking in metrics, they wouldn't have much to go on. They might point to it as a huge waste of resources, and humanity would have lost perhaps the most important building of the first millennium CE.

The same "bridge" pattern applies to nuclear power, flight, spaceflight, the Human Genome Project, the LHC, and Tesla cars. What they share is a huge upfront cost, a huge payoff after the fact, and the inability to get this value by only doing part of the work.

Now apply this thought to startups. I sometimes wonder what companies we're not building due to our obsession with metrics and incremental progress.

On startups and advice

Giving and receiving advice is an extremely hard thing to do, even when both parties have the best intentions. In my startup career, I've frustrated many well-intentioned and extremely capable advisors over the years, by refusing to do things I don't understand the reasoning behind.

I think I recently had an insight that explains my instinctive reaction, and shows a better way of giving and receiving advice.

Imagine me playing a game of chess (and I'm really not good at chess), getting myself into a difficult situation, when suddenly Garry Kasparov enters the room. "Hey Garry, can you give me some advice here?" I shout. "Sure", he answers. He walks over, takes a look at the board, and tells me the next three moves he would make if he was faced with this board. He then walks off to wherever he was going to in the first place.

What good fortune, having Garry Kasparov play a few moves for me. Surely I'll destroy my poor opponent now. But wait -- why did he make those moves? And how do I make the most of the resulting situation? How do I defend against the weaknesses of the resulting position? How do I react to my opponent's various options? Well, Garry's not here now, so all I can do is to potentially make things worse for myself by trying to bring the board back to a situation I understand.

This is how a lot of advice-giving feels to me, having been both on the giving and receiving ends. People may suggest things I should do, but I don't understand why those options are better than others. And even if I follow them, I won't be able to explain to the team why with a convincing narrative. And even if they go along, we won't be able to make the most of the energy we put in, as we'll be essentially cargo-culting it.

If you're giving advice, the best you can do is not to suggest the "best move", or what you would do in the other's place. The absolute best is to suggest the best move that the other person can fully understand and exploit, even if they wouldn't have thought of it themselves. If you're particularly wise, you may suggest courses of action that will leave the receiver better off, but not for reasons they understand now (but will later). E.g. encouragement to start a startup was some of the best advice I ever received, even if my current venture ends in no big exit. This is because the process itself has made me immensely more capable and experienced than the alternatives would have.

If you're looking to receive advice, unless the advice-giver understands (or can be made to understand) the above, it follows that maybe the best advice wouldn't come from a Bill Gates or Elon Musk after all, who think in terms we can't even begin to comprehend, but maybe from someone a little better than you, who's recently been where you were.

I should note in closing that all the usual defenses against bad feedback should still hold. Don't take one person's success as proof that everything they did was right, unless you've seen everyone else who's tried the same thing.

npm now the largest module repository

I've been stalking npm for a while now, and as of June 30th, it is the largest of its kind, having surpassed Rubygems and Maven Central in quick succession.

I post this as an update to my prediction last December, that npm would be the largest repository of its kind by February 2015. It felt kind of bearish at the time, but not THAT bearish. npm is not only the largest repo of its kind. It's accelerating.

(screenshot from modulecounts.com)

Certainly many will now say that this was obviously coming for a while, and therefore that this is an arbitrary event offering nothing new. To some degree I am in agreement with that sentiment, but I happen to believe large shifts in our world happen gradually to the point where there is no 'big bang' event that signifies their arrival. In situations like that, we only have these 'trivial' events to latch on to as the dates when something was indisputable. Besides, 'it's the largest' is much easier to understand than 'is on track to become largest real-soon-now if assumptions hold'. And that by itself means even more teams will jump on the bandwagon, further accelerating the trend.

Others will say that the comparison isn't fair. The node community is known for favouring small composable components over large monolithic ones, so the modules in npm should count for less. Instead of wading into the details of the argument, I'll offer this simple response: assuming that each npm module counts for a tenth of a 'normal' module does nothing to alter the slope of growth. As long as the slope holds, if the day where npm exceeds every other module repository isn't today, it will be tomorrow.

I guess the final response is that it doesn't really matter. Again, in a sense I agree. But in another sense, even if 90% of everything is crap, there is value in understanding which way the winds are blowing. Even if you won't abandon your favourite language for JS any time soon, it helps to know how to pitch your language when your target market is evolving.

All in all, this could be a meaningless factoid, but I still think it's a pretty neat feat for a language nobody took seriously for the first several years of its existence.

Breaking the rules: Rendering JSON, with only a recursive AngularJS template

Being the "business guy" at my startup means that when I code, I do it for fun. And being a former academic, means that when I want to have some fun coding, it turns into research.

One of the things I love about AngularJS is how little code it takes to do cool things if you're doing it right, and how it extends the declarative nature of HTML for the webapp era. So, when I do fun coding research, I try to see how far I can go with it. One of my favourite recent creations, basically contains no code at all. But "basically" is a bit different than "no code at all", so there's still room to improve.

I've always thought that JSON should be super-easy to visualise, so I've given it a go with Angular. But not just any Angular approach would do. I decided to base it on some previous work I'd done with recursive templates. In the process of getting that to work and producing a very elegant result (if I may say so myself), I had to break or bend several "best practices" of AngularJS. My view is that these breaks are justified, the kind of rule-breaking that is permitted if you can clearly argue your thought process. Let's start with the code, and we'll get to the arguing afterwards. 80 lines of Angularized HTML and JavaScript that will display any JSON file you point them to.

Mouse over the example below for a cool little effect. You can see the code by clicking on the "Code" button on the top right of the window below.

So how is this breaking the rules? Let me count the ways:

1. Disabling the digest limit

Notice the little line that reads

$rootScopeProvider.digestTtl(Infinity);

That line basically allows us to render a JSON file as deep as we like. The catch is that we need to suspend a limit that was put there for a reason. AngularJS has a digest limit because infinite recursion happens really easily when you do things wrong. But we know that this code doesn't infinitely recurse. It just recurses deeper than the 10-digest limit, with the example JSON file I'm using. And it could go deeper if given a deeper nested file.

Isn't there a way to work around this? Yes there is. The way is to create a specialised directive that does the job of the template. But that would be the equivalent of respecting the rules for their own sake. And when we're doing fun coding for research purposes, respecting the rules is the one thing we don't do, especially for their own sake. Never mind the elegance tax that would entail.

2. Value in value. What?!

If you look at lines 68 and 75, you'll see a somewhat odd formulation:

ng-repeat="(key, value) in value"

This is outrageous! Any programming language would choke at this obvious circular reference. Or is it?

Angular has an interesting way of parsing the 'in' operator, which we exploit to make template recursion work. The left side of the operator, in our case (key, value), gets evaluated in the context of the child scope that is created for each iteration. On the other hand, the right side, in our case value, is understood to be in the parent scope. As such, not only do the two instances not clash, they actually enable the recursion, since one scope's child is another scope's parent.
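To make the scoping concrete, here is a plain-JavaScript model of the idea. It is only a sketch and not Angular's actual implementation: Object.create stands in for the child scope that ng-repeat creates per iteration, and renderScope and sample are names I made up for the illustration.

```javascript
// Hypothetical model of ng-repeat="(key, value) in value" using plain objects.
function renderScope(parentScope, depth, lines) {
  // Right side of `in`: read from the scope the ng-repeat sits on (the parent).
  const collection = parentScope.value;
  for (const key of Object.keys(collection)) {
    // One child scope per iteration, inheriting from the parent.
    const childScope = Object.create(parentScope);
    // Left side of `in`: key and value are set on the child scope,
    // shadowing (not overwriting) the parent's `value`.
    childScope.key = key;
    childScope.value = collection[key];
    const indent = '  '.repeat(depth);
    if (childScope.value !== null && typeof childScope.value === 'object') {
      lines.push(indent + key + ':');
      // The child now plays the parent for the next level down.
      renderScope(childScope, depth + 1, lines);
    } else {
      lines.push(indent + key + ': ' + JSON.stringify(childScope.value));
    }
  }
  return lines;
}

const sample = { name: 'demo', tags: ['a', 'b'], nested: { deep: { answer: 42 } } };
const rendered = renderScope({ value: sample }, 0, []);
console.log(rendered.join('\n'));
```

Every level reads value from the scope above it and defines its own value for the level below, which is the whole trick.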

3. ng-init

The sinful lines 68 and 75 also read:

ng-init="groupValue = $parent.value"

ng-init is not really supposed to be used, except to alias things in combination with an ng-repeat. Since these two lines do indeed also have an ng-repeat, this isn't such a terrible violation. But it is cheeky. What I'm doing there is to pass the child an easily findable reference to its parent. You might ask why I'm doing this, since I could just use $parent? Well, we actually create 3 layers of inheritance for every level of nesting in a JSON file. This means that to reach the actual parent from where I need it, I have to do something like $parent.$parent.$parent. And this isn't just ugly. It's fragile, as it depends on the levels of inheritance being exactly three. By creating the groupValue reference, I'm able to reach the greatgrandparent with a simple name, even if I'm unsure how many generations actually intervene.

4. Writing on the parent

But why do I want to reach the greatgrandparent you ask? Isn't that in itself a sign of trouble? Well, I'm glad you asked! As you may know, Angular uses vanilla JavaScript objects and prototypal inheritance for its scope hierarchy. This is super cool and super easy to set up, until you try to write on a property your current scope inherited from a parent scope. When you do, the property becomes detached and receives no further updates should it change in the parent. Further, the changes you make in the child don't get reflected up the chain. The normal remedy (hack-around) is to not store direct values in scope properties, but to store object references and put the values there. But this doesn't work with our value in value hack, so here we are. Writing to the parent means there is no more breaking of the inheritance hierarchy.

What this in turn means is that not only do we render our JSON file, we can actually edit its values, and those values will be reflected in the property that holds the deserialised JSON file we used to begin with. We can then send that back wherever we got it from, and it will contain all the edits. Not bad for an 80-line snippet, eh?
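The detaching behaviour is plain JavaScript and can be reproduced without Angular. The sketch below uses bare objects as stand-ins for scopes. The way groupValue is used at the end (indexing it by key) is my assumption about the general shape of the fix, not a copy of the template, and all names are illustrative.

```javascript
// Plain objects standing in for an Angular scope chain.
const parentScope = { value: 'original' };
const childScope = Object.create(parentScope);

// Reading falls through to the parent via the prototype chain.
const readBefore = childScope.value;

// Writing creates an own property on the child; the parent is untouched,
// and later changes on the parent no longer show up in the child.
childScope.value = 'edited in child';
parentScope.value = 'updated in parent';
const detached = childScope.value;

// Writing through a named reference to the object that owns the key
// avoids the shadowing, however many scope layers sit in between.
const data = { title: 'old' };
const owner = { value: data };
const leaf = Object.create(Object.create(Object.create(owner)));
leaf.key = 'title';
leaf.groupValue = owner.value;        // alias to the object holding the key
leaf.groupValue[leaf.key] = 'new';    // the write lands on the shared object

console.log(readBefore, '|', detached, '|', data.title);
// prints: original | edited in child | new
```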

5. Gratuitous rule-breaking: window.sc

Since we've reached proper first-world anarchy levels of rule-breaking, nothing's stopping us now. I've included one of my all-time favourite hacks:

window.sc = $scope

Putting this in the root controller gives us easy access to the scope through the console. Nothing better when you quickly want to see what's on the scope for debugging purposes. Yeah, you could access it using a much longer manoeuvre, but why bother? In our case, you could type sc.value to quickly verify that indeed the two-way binding is preserved despite the multiple levels of inheritance.

If nothing else, I hope the above piece of code will help you understand more about the inner workings of AngularJS and how/why it works like it works. Do I advise you use these tricks in production? Some yes, most no. If anything, it will confuse everyone else and you'll need to have a long conversation about it.

Hope you had fun, hit me with your comments!

Lessons from working 6 months on a math problem (and failing)

I'm a coder, most certainly not a mathematician. But when I saw the 17x17 challenge in late 2011, I couldn't resist having a crack at it. You can read a bit more on the problem itself here. Each day, I couldn't resist spending another day on it. Long story short, despite some temporary success, I didn't end up solving the problem. But I consider that time spent, about 6 months of full-time effort, to have been an incredibly worthwhile investment. Below is the furthest I got, a grid with 3 monochromatic rectangles (every corner of the rectangles marked is on the same colour). Still not zero which was the goal, but for a few months, it was the 'least broken' solution known.
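For the concrete-minded, the number being driven down to zero can be computed with a naive counter like the one below. This is a minimal sketch, not the code I actually ran: the function name and the tiny sample grids are made up, and cells simply hold a colour index.

```javascript
// Counts axis-aligned rectangles whose four corner cells share one colour.
// `grid` is an array of rows; each cell holds a colour index.
function countMonochromaticRectangles(grid) {
  let count = 0;
  const rows = grid.length;
  const cols = rows === 0 ? 0 : grid[0].length;
  for (let r1 = 0; r1 < rows; r1++) {
    for (let r2 = r1 + 1; r2 < rows; r2++) {
      for (let c1 = 0; c1 < cols; c1++) {
        for (let c2 = c1 + 1; c2 < cols; c2++) {
          const colour = grid[r1][c1];
          if (grid[r1][c2] === colour &&
              grid[r2][c1] === colour &&
              grid[r2][c2] === colour) {
            count++;
          }
        }
      }
    }
  }
  return count;
}

const cleanGrid = [[0, 1, 0], [1, 0, 1], [0, 0, 1]];
const brokenGrid = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]; // four corners are all 0
console.log(countMonochromaticRectangles(cleanGrid), countMonochromaticRectangles(brokenGrid));
// prints: 0 1
```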

It might not look like much, but it took crazy amounts of time to find the right colourings for those damn cells. The solution ended up being found by means of a SAT solver, but here's what I learned in the process of not finding it:

1. A problem which is easy to state can addict quite severely

I was able to get some people hooked on the problem just by explaining it to them in a few minutes. The same process probably happened with me and lots of others on the internet who got caught up with this problem over the years it took to solve. Maybe there's something there to learn about motivation, or perhaps asymmetries like this are just a rare occurrence.

2. Optimisation with instant feedback is incredibly addictive

As I started writing code for this problem, I found out I could work on it for hours on end without distraction, quite unusual for me. I chalk this up to a tight feedback loop. Have an idea, implement it, get back a number. Think of ways to improve the number, start the cycle all over again. This is a cycle I would run dozens of times a day. Obviously more fundamental ideas would take more time to see the fruits of, and that's when I actually lost focus, but when the brain has been rewarded so richly with small cycles, it can afford to go a bit longer without reinforcement. This tells me that A/B testing or a similar numeric optimisation area would be quite motivating to me and I've made a mental note to go into this in the future.

3. There are some really big numbers out there

The space of possible colourings for the 17x17 problem is almost 10^174. Comprehending this number is beyond the human mind. If you took the number of all particles in the observable universe and squared it, you would still be a factor of 10^14 off.
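The arithmetic is easy to check exactly with BigInt. The sketch below assumes 4 colours per cell, which is what the 10^174 figure implies, and takes the commonly quoted rough estimate of 10^80 particles in the observable universe as given.

```javascript
// 17 x 17 cells, each taking one of 4 colours (assumed colour count).
const cells = 17n * 17n;                 // 289 cells
const colourings = 4n ** cells;          // exact value via BigInt
const digits = colourings.toString().length;

// Rough estimate of particles in the observable universe (an assumption).
const particles = 10n ** 80n;
const ratio = colourings / (particles * particles);

console.log(digits);                     // 174 digits, so just under 10^174
console.log(ratio.toString().length);    // 14 digits, so a factor of nearly 10^14
```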

While I can't say I grasped that number in the slightest, I do feel it has recalibrated my sense of what 'big' is. The earth's population which nears 7 billion now feels like a decidedly small number.

4. The value of optimisations at different levels of the stack

Once my basic strategy was set, a lot of the work came down to optimisation. I categorise that work into three fairly separate categories: mathematical, algorithmic, and implementation/hardware.

By mathematical optimisations, I mean ones where discovering a symmetry in the search space allowed me to prove that solving one case equated to solving a whole class of cases, as each member of the class was equivalent to every other member. If I discovered a symmetry that folded 32 cases into one, I automatically had a 32x speedup.
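A toy version of this folding, on a scale small enough to enumerate: reordering rows or columns never changes how many monochromatic rectangles a grid has, so grids that differ only by such reorderings can share one representative. The sketch below does this for 2-colourings of a 3x3 grid. It illustrates the general idea only, not the specific symmetries I used, and every name in it is hypothetical.

```javascript
// All orderings of the indices 0..n-1.
function permutations(n) {
  if (n === 0) return [[]];
  const result = [];
  for (const perm of permutations(n - 1)) {
    for (let i = 0; i <= perm.length; i++) {
      result.push([...perm.slice(0, i), n - 1, ...perm.slice(i)]);
    }
  }
  return result;
}

const SIZE = 3;
const PERMS = permutations(SIZE);

// Smallest string among all row/column reorderings of the grid.
function canonicalForm(grid) {
  let best = null;
  for (const rowPerm of PERMS) {
    for (const colPerm of PERMS) {
      const key = rowPerm.map(r => colPerm.map(c => grid[r][c]).join('')).join('/');
      if (best === null || key < best) best = key;
    }
  }
  return best;
}

// Enumerate every 2-colouring and collect one representative per class.
const classes = new Set();
const total = 2 ** (SIZE * SIZE);
for (let bits = 0; bits < total; bits++) {
  const grid = [];
  for (let r = 0; r < SIZE; r++) {
    const row = [];
    for (let c = 0; c < SIZE; c++) row.push((bits >> (r * SIZE + c)) & 1);
    grid.push(row);
  }
  classes.add(canonicalForm(grid));
}
console.log(total + ' colourings fold into ' + classes.size + ' classes');
```

Only one grid per class needs to be examined, so the work shrinks by the ratio of the two numbers printed.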

Algorithmic optimisations are more pedestrian. A certain computation needs to be made, and finding a better way to compute it means speeding up the whole process, since as a rule these computations ran millions of times. By the end, I realised that indeed the best way is to have the code do almost nothing.

Implementation optimisations are ones that have to do with better mapping to hardware. For instance at some point I decided that Python wasn't going to cut it and I'd need to drop to C. By translating my algorithms unchanged into C I got about a 10x improvement. This was a very naive translation done in a day, without me having much experience in C past some classes in university.

Of course, I also cheated. I realised that my code spent a huge amount of time counting the set bits in a byte. I could implement that on my side, or get a CPU that implements that as a single instruction. I did spend quite some time implementing faster versions of it, mostly ending up with lookup tables, but at the end of the day, the i7 family of CPUs has POPCNT as a builtin. Moving to the i7 caused a big speedup, and using the popcnt builtin was another large boost to speed. Unfortunately I don't have exact data but it was certainly integer multiples of the previous level of performance.
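For reference, the lookup-table flavour of bit counting looks roughly like this. It is a sketch in JavaScript rather than the C I was using, and the rectanglesBetween helper is a hypothetical example of why bit counts matter for this problem, not my actual inner loop.

```javascript
// POPCOUNT_TABLE[b] is the number of set bits in the byte b.
const POPCOUNT_TABLE = new Uint8Array(256);
for (let b = 1; b < 256; b++) {
  // Bits in b = lowest bit of b + bits in b shifted right by one.
  POPCOUNT_TABLE[b] = (b & 1) + POPCOUNT_TABLE[b >> 1];
}

// Count set bits in a 32-bit value, one byte at a time.
function popcount32(x) {
  return POPCOUNT_TABLE[x & 0xff] +
         POPCOUNT_TABLE[(x >>> 8) & 0xff] +
         POPCOUNT_TABLE[(x >>> 16) & 0xff] +
         POPCOUNT_TABLE[x >>> 24];
}

// Hypothetical use: each row keeps, per colour, a bitmask of the columns
// holding that colour. Two rows sharing k columns of one colour form
// k * (k - 1) / 2 monochromatic rectangles.
function rectanglesBetween(maskA, maskB) {
  const shared = popcount32(maskA & maskB);
  return (shared * (shared - 1)) / 2;
}

console.log(popcount32(0b1011), rectanglesBetween(0b10110, 0b00111));
// prints: 3 1
```

A hardware POPCNT instruction replaces the four table lookups with a single operation, which is where the extra speed came from.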

The vast array of optimisation techniques I could bring to bear was definitely a surprise. Generally the mathematical optimisations tended to dominate whenever I could get them, but each optimisation level in aggregate yielded many orders of magnitude worth of improvement in my final result. While I don't have exact numbers, I do recall that a single run went from around 4 minutes to tens of ms, and then the mathematical optimisations reduced the amount of runs needed from a theoretical 2 * 10^20 to about 100 million. All in all, by the end I was able to do a complete run in a bit over a day on 2 cores of an i7. That run produced the matrix above, but unfortunately, no solution.

5. The interplay between thinking in code and thinking on paper

It's clear that an algorithmic breakthrough can be worth 1000 micro-optimisations. But sometimes the opposite happened. Glancing through the results of an algorithmic run showed a pattern I wasn't aware of. Taking that back to paper allowed me to make further theoretical breakthroughs that sped up runtime by additional orders of magnitude. So getting my hands dirty with code, besides giving non-trivial speedups, also hinted at mathematical symmetries not to be sneezed at.

6. Deep understanding of combinatorics and symmetries

In other words, I learned some math on the intuitive level. My knowledge of symbols is still probably poor, but I have developed a mental toolkit on combinatorics that I trot out once in a while, and will probably invest more in down the road. I hear combinatorics is a good avenue to probability, and probability is something I really want to get good at.

7. Intuitive understanding of trees

Working through combinatorics, one cannot avoid getting very familiar with the tree data structure. It's not that I couldn't understand a tree before, but there is a whole other level of 'burnt in' that something can get when you keep digging those neural pathways day in and day out. It helped me work on parsers and grammars soon after, and to this day I think I use the tree structure to consider possible avenues of action in my current role as 'business guy'.

8. If your approach is wrong, hard work doesn't matter

Despite the enormous amount of work I put in, I bet on an early assumption that I knew would make or break the effort. I did this knowingly as I felt I couldn't beat more experienced mathematicians without cheating somehow. My assumption ended up being wrong, and therefore my code yielded no results. No matter how hard I worked, it wouldn't have made a difference.

This is always something to keep in mind when working on startups. If your high-order bit is set wrong, you can optimise the lower ones to infinity and while the final cause of death might be 'ran out of money', the effort may be finished before it starts for reasons like 'chose too small a market'.

9. If you're passionate enough, it may be the direct results that don't matter.

This is not to say that just because you might fail you shouldn't try. While working on this problem, I knew full well I could and probably would fail. But doing a mental check on the pleasure and learning I drew from the exercise, I constantly came up positive, such that actually finding a solution would be the icing on the cake. I feel the same way about my startup. I want it to succeed, in fact I'm doing my absolute best to ensure it does, but at the same time I know I'm already on the plus side in terms of experience gained such that if it all were to evaporate tomorrow, I'd still have been more than adequately rewarded for my efforts.

10. The immense joy of shutting everything out and falling into absolute focus

I consider my focus to be very fragile. It has taken many years of trial and error (or just aging) to approach some stable level of productivity, and even this is definitely not 100%. While working on the 17x17 however, I was able to experience something completely new for me: absolute focus approaching bliss. Sometimes I'd start working in the morning, and when my girlfriend would come back home in the evening I would feel like I'm waking up into reality, only to spend a few hours resting to start again next morning. There's something special to be said about that, regardless of all the other lessons above. My girlfriend did mention that I may be slipping into Uncle Petros territory, but thankfully I was able to stop in time to finish my PhD and move on.

I'd like to avoid closing with a cliché worthy of a whiskey commercial such as "...Because sometimes, the journey is worth more than the destination...". Instead, I'll just say that I'd trade several of my minor successes for another failure this good. Maybe another problem will come and capture my imagination in a similar way in the future.

Fun aside: HN user TN1ck did up the actual 17x17 solution in d3.js and SVG/AngularJS, using the technique described in my previous post. Cool stuff!