A new iPhone PIM on the block – and it’s good!

After all my ranting and raving about ToDo versus ToodleDo, Pocket Informant comes along and changes everything.

Just a few months ago, I went to great pains to write about my anguish over choosing between ToDo and ToodleDo as my iPhone task manager.  So it’s rather ironic that I should be writing about something else now.  But write I must, cuz Pocket Informant – though not perfect – raises the bar in some pretty fundamental ways.

When I was using ToDo and ToodleDo, I was also using the iPhone’s built-in calendar app for personal appointments.  I had another system on my server, Calcium, on which I kept my work appointments.  Calcium is cool because you can configure it to allow anyone to add an event so long as that block of time is free.  So any student could book time with me without logging in or jumping through any other hoops.  The problem with this was that (a) most students didn’t make an appointment – they just dropped in, and (b) I had to “sync” Calcium and my iPhone calendar by hand, copying appointments manually when needed.  This led to more scheduling malfunctions than I would have liked.

I’d looked at using iCal on my server, but it was just too hard to sync things with my laptop at home and my iPhone, without converting to MobileMe and whatever else Apple wanted me to use.

Then along came Pocket Informant.  It’s been around for other platforms for a while, but the iPhone app is quite recent.  PI includes a very full-featured task manager and a full calendar system.  It syncs tasks with Toodledo’s servers (like ToDo and ToodleDo) and calendar events with Google Calendar.  And since I can access GCal from any browser, syncing my laptop becomes unnecessary.  PI gives me a wide variety of layouts, including a very useful Today view of what I have to do just today.  It supports both GTD and Franklin-Covey prioritizing systems.  It also has an integrated search capability that looks through my calendar, my tasks, and my contacts too.  Another really cool feature is that it looks for the first reasonable task from each project that has been started but isn’t complete, and creates a list of all next actions, which are then displayed on the Today view.  Of course, you can reorder tasks in any project, so the right item comes up as the next action.  Brilliant!

There were three things I had to give up, none of which are showstoppers for me.

  1. Letting students book themselves time with me.  Fortunately, as I’ve suggested above, this wasn’t a big deal.
  2. My iPhone is jailbroken, and I had bought an app called IntelliScreen that showed my calendar on the iPhone’s lockscreen.  With it, I didn’t need to unlock the phone every time I wanted to check my next appointment.  IntelliScreen only works with the native iPhone calendar app, so switching to PI meant giving that up.  I can say, however, that after a couple of months of living without IntelliScreen, I’m doing very well indeed.  So I guess I didn’t need that feature after all.
  3. Under iPhone OS 2.X, third-party software can’t access the system to alert the user – so event alerts and reminders in PI’s calendar don’t work.  This would be a big deal if I depended on those reminders.  Fortunately, I don’t.  And it seems this will change in the next release of the iPhone OS, so it won’t be long before PI’s reminders work properly.

One thing I had to think through was how to deal with my personal appointments on GCal.  I want people to know I’m busy, but not necessarily why.  So I set up a “personal” GCal calendar that I can edit from my laptop, my server at work, or my iPhone, and since that calendar is configured to hide details, others only see that I’m busy.  Combined with my other GCal calendars, I get what I need.  Not as elegant as I would have liked, but plenty good enough for me.

There are some things about PI that bother me (given my experience with ToDo and ToodleDo).  Some things are just annoying, others are probably bugs.  But PI for the iPhone is only at version 1.02, so I’m willing to cut them some slack and give them a chance to sort it out.  (I should add that the differences between version 1.01 and 1.02 were huge and excellent, so I’ve got big expectations.)  A few of these things are:

  • The star icon for “starred items” is yellow, just like the icon used to indicate that a task has a “note” with it.  As a result, I often get confused between notes and stars.
  • There’s no way to choose which GTD features you want to use.  ToodleDo let you configure that yourself, to cut down on the size and complexity of the data entry fields.  In PI, you just have to ignore the fields you don’t want – which means you sometimes end up tapping the wrong item.
  • The order of next actions shown in the Today view appears random; I wish there were some way to sort them.
  • There’s no “fast” way to enter a task(1).  In ToDo, there’s a “lightning add” function that uses user-defined defaults for all task parameters except the task name.

So it turns out that neither ToDo nor ToodleDo is the right answer for me.  The ability to have a calendar and a task manager together just outweighs everything else.  So I’m with PI now.

Now, there is one more app, a real dark horse in my opinion, that has huge potential.  The app is SmartTime.  It has an impressive user interface: clean, simple, and usable.  It also has a fascinating way of arranging your tasks.  You tell it how long you want to spend doing something, and SmartTime will schedule it wherever in your schedule it can find the time.  If you don’t get to something in time – i.e. you don’t mark it complete – it can just bump it forward to the next available slot.  And it syncs both tasks and events with GCal alone.

I really, really like it, except for one thing that, for me, is a mortal flaw: each project requires you to create two Google calendars, and the process of connecting those calendars to the projects is not very easy – certainly not as easy as it should be.  Indeed, I’ve found it to be supremely inconvenient, especially as I keep a fairly large number of projects, and add new ones quite frequently.  This is the only thing that has stopped me from switching to SmartTime.  It’s too bad, cuz I love the user interface.

Anyways, there it is.  Even if you go through massive rationalization to decide on a good solution, you must always be ready for the alternative you never thought of till someone brings it up.

  1. Update 21 June 2009: Actually, PI v1.02 does have fast task entry, but it is active only in some folders, like the Inbox.  It is not available in project folders, which is where I do all my task entry, and which is why I didn’t notice the feature till late last night.

ToDo or ToodleDo, that is the question

Two “todo” apps are vying for my iPhone’s heart. Here’s how I decided on a winner.

I like PDAs because they help me manage the things I have to do – and I’m all about the “todo” lists. I don’t know if I’ve become dependent on lists because I have a bad memory, or if my memory is failing because I use lists for everything.  Still, it is as it is.

Over the past year or so, a number of todo apps have come out for my beloved iPhone, and I’ve been trying most of them. It’s surprising how I keep coming back to the same two apps, and equally surprising (to me) that after months of playing around with them, I still can’t quite decide which one I prefer.

The two apps are Appigo’s ToDo and ToodleDo for the iPhone. Both cost only a few dollars, and both are very well-rated by the public at large.

So, I figured, let’s use some design analysis tools to evaluate the two apps, and see what the numbers say.

I’m going to use two tools: pairwise comparison, and a weighted decision matrix. These tools aren’t only useful for analyzing designs – they’re basic decision-making tools, and they’ve always done right by me when evaluating designs, conceptual or otherwise.

Both tools depend on having a good set of criteria against which the two apps will be compared. You might not know what decision to make, but you need to know how you’ll know that you’ve made the right one. In our case here: How do I know when I’ve found a good todo app?

The formal term for what I’m doing here is qualitative, multi-criterion decision-making. It generally involves four tasks, which in my case are:

  1. Figure out criteria that apply to any “best” todo app.
  2. Rank the criteria by importance, because the most important criterion will affect my decision more than the others.
  3. Develop a rating scale to rate each app.
  4. Rate the apps with the rating scale and the weights.

Here are my criteria, in no particular order of importance, based on years of using other task management tools:

  • Fast. No long delays when telling the app to do something.
  • Easy. Minimal clicking (e.g. hitting “accept” for everything or burrowing into deeply nested forms and subforms).
  • Repeats. Repeating items at regular intervals.
  • Priorities. At least three levels of priority for tasks.
  • Checkoff. One-touch checking off of done items.
  • Backup. Easy backup (or sync) to some remote server that is fairly robust, using standard formats.
  • Groups. Group items by tag or folder or project or whatever.
  • Sorting. Multiple ways to sort items.
  • Hotlist. Some overview page showing only near-term, important items.
  • Restart. Picks up next time I run it where I left off last time (oddly, not every iPhone app does this).
  • Recovery. Uncheck items that were accidentally checked off.
  • Conditional deadlines. Due dates based on due dates of other items (e.g. task B is due two weeks after task A is completed).
  • Links. Link an item to a folder of other items.

Oddly, not a single iPhone app I’ve checked out so far meets all my requirements.  In particular, I’ve not found any apps that even try to meet the last two requirements. I say “oddly” because I don’t think these requirements are excessive. Still, there it is.

Next, we have to develop weights to assign relative importance to the criteria. The word relative is key here; we’re not going to say that one criterion is certainly and universally more important than any other. What I want is to know how important each is with respect to the others and my own experience. Remember, one size never fits all.

This is where pairwise comparison comes in. Details on how this works are given in another web page (it ain’t hard).  The chart below shows just the end results.  In each cell is the criterion I thought was the more important of the pair given by that cell’s row and column.  Since it doesn’t make sense to compare something to itself, and since these comparisons are symmetric (comparing A and B is the same as comparing B and A), I only need to fill in a little less than half of the whole chart.  If you’re thinking this took a long time, you’d be wrong. It took me about 15 minutes to fill in the whole thing.

(Abbreviations: Rep = Repeats, Pri = Priorities, Chk = Checkoff, Bak = Backup, Grp = Groups, Srt = Sorting, Hot = Hotlist, Rst = Restart, Rec = Recovery, CD = Cond. Deadlines, Lnk = Links.)

        Easy  Rep   Pri   Chk   Bak   Grp   Srt   Hot   Rst   Rec   CD    Lnk
Fast    Easy  Rep   Pri   Fast  Fast  Grp   Srt   Hot   Fast  Fast  CD    Fast
Easy          Rep   Pri   Easy  Easy  Grp   Srt   Easy  Rst   Easy  Easy  Easy
Rep                 Rep   Rep   Rep   Rep   Srt   Rep   Rep   Rep   CD    Rep
Pri                       Pri   Bak   Grp   Srt   Pri   Pri   Rec   Pri   Lnk
Chk                             Bak   Grp   Srt   Hot   Chk   Chk   CD    Lnk
Bak                                   Bak   Srt   Bak   Bak   Bak   Bak   Bak
Grp                                         Srt   Hot   Grp   Grp   Grp   Grp
Srt                                               Srt   Rst   Srt   Srt   Lnk
Hot                                                     Hot   Hot   Hot   Hot
Rst                                                           Rst   CD    Lnk
Rec                                                                 CD    Lnk
CD                                                                        CD

This leads to the following weights:

Fast 6%
Easy 9%
Repeats 13%
Priorities 8%
Checkoff 3%
Backup 10%
Groups 10%
Sorting 13%
Hotlist 9%
Restart 4%
Recovery 1%
Cond. Deadlines 8%
Links 6%

So this tells me that I think having repeating tasks and good sorting of items are the two most important criteria.
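To make the arithmetic concrete, here’s a short Python sketch – my own reconstruction, not something from the pairwise-comparison write-up I linked to – that counts each criterion’s wins in the chart above and divides by the total number of comparisons to get the percentage weights:

```python
from collections import Counter

# Winners of each pairwise comparison, row by row from the chart above:
# row i lists the winner of criterion i vs. every later criterion.
rows = [
    ["Easy", "Repeats", "Priorities", "Fast", "Fast", "Groups", "Sorting",
     "Hotlist", "Fast", "Fast", "Cond. Deadlines", "Fast"],                  # Fast
    ["Repeats", "Priorities", "Easy", "Easy", "Groups", "Sorting", "Easy",
     "Restart", "Easy", "Easy", "Easy"],                                     # Easy
    ["Repeats", "Repeats", "Repeats", "Repeats", "Sorting", "Repeats",
     "Repeats", "Repeats", "Cond. Deadlines", "Repeats"],                    # Repeats
    ["Priorities", "Backup", "Groups", "Sorting", "Priorities", "Priorities",
     "Recovery", "Priorities", "Links"],                                     # Priorities
    ["Backup", "Groups", "Sorting", "Hotlist", "Checkoff", "Checkoff",
     "Cond. Deadlines", "Links"],                                            # Checkoff
    ["Backup", "Sorting", "Backup", "Backup", "Backup", "Backup", "Backup"], # Backup
    ["Sorting", "Hotlist", "Groups", "Groups", "Groups", "Groups"],          # Groups
    ["Sorting", "Restart", "Sorting", "Sorting", "Links"],                   # Sorting
    ["Hotlist", "Hotlist", "Hotlist", "Hotlist"],                            # Hotlist
    ["Restart", "Cond. Deadlines", "Links"],                                 # Restart
    ["Cond. Deadlines", "Links"],                                            # Recovery
    ["Cond. Deadlines"],                                                     # Cond. Deadlines
]

wins = Counter(w for row in rows for w in row)
total = sum(wins.values())  # 13 criteria -> 13 * 12 / 2 = 78 comparisons
weights = {c: n / total for c, n in wins.items()}

for criterion, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{criterion:<16} {weight:.0%}")
```

Rounded to whole percentages, the computed weights match the table above (Repeats and Sorting both come out at 13%).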

The point of this process is that the human mind is not good at juggling a bunch of variables, but it is very good at comparing one thing against another. Take the trivial case of choosing between three alternatives, A, B, and C. If you prefer A to B, and B to C, then you should accept the logic that A is the most preferred item.  To do otherwise just isn’t rational.  That’s exactly what pairwise comparison does. And there’s good evidence that this technique actually works.

The next step is to choose a rating scale.  This scale will be used to rate each app with respect to each criterion.

There’s a variety of scales I could use, and a great deal of research into qualitative measurement scales has been done.  The scale that works best for me – and seems to be the most general – is a five-point scale from -2 to +2, where 0 means “neutral,” -2 means “horrible,” +2 means “excellent,” and -1 and +1 are in-between values.  If you prefer something a little finer, you can use a 7-point scale from -3 to +3.  I think it’s important to have  a zero value to indicate neutrality, and I find it meaningful to have negative numbers stand for bad things and positive numbers for good things.

It’s interesting to note that in some industries (e.g. aerospace), I’ve noticed a tendency to use an exponential scale – something like (0, 1, 3, 9).  This is because aerospace people tend to be extremely conservative (for reasons both technical and otherwise), so they tend to underrate the goodness of things.  This scale inflates any reasonable rating to make up for that conservatism.

But I’m neither an aerospace engineer nor particularly conservative, so I’ll use the -2 to +2 scale.

Now we can do the weighted decision matrix. The gory details are given elsewhere.  The weights come from the pairwise comparison above.  In a decision matrix, we rate each alternative against some well-defined reference or base item.  We need a reference because we need a fixed point against which to measure things.  If we were evaluating design concepts, none of them would be suitable as a reference, since a “concept” design is not well-defined.  In this case, we’re evaluating two existing apps, so we can choose either one of them as the reference.  For no particular reason, I’ll use ToDo.

I worked up a weighted decision matrix comparing ToodleDo to ToDo.  Here it is:

                  Weight   ToDo (reference)   ToodleDo
                           Rating   Score     Rating   Score
Fast              0.06       0        0         0        0
Easy              0.09       0        0        -1       -0.09
Repeats           0.13       0        0         0        0
Priorities        0.08       0        0         0        0
Checkoff          0.03       0        0         0        0
Backup            0.10       0        0        -1       -0.10
Groups            0.10       0        0         0        0
Sorting           0.13       0        0         1        0.13
Hotlist           0.09       0        0         1        0.09
Restart           0.04       0        0         0        0
Recovery          0.01       0        0         0        0
Cond. Deadlines   0.08       0        0         1        0.08
Links             0.06       0        0         0        0
Total                                 0                  0.11

This table might not look like much, but it tells a bit of a story.  ToDo is the reference, so I’ve given it zeros in every category.  That way, when I compare ToodleDo to it, a positive number means it beats ToDo and a negative number means it’s worse than ToDo.  Obviously, they’re very close to one another.

If you look at the ratings for ToodleDo, you see that it’s a bit better than ToDo on some points, and a bit worse on others.  But the +1’s don’t actually cancel out the -1’s because of the weights.  The criteria on which ToodleDo beat ToDo are more important to me than the others, because the weights are higher.  That makes ToodleDo just a little bit better than ToDo.
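The whole computation is just a weighted sum. Here’s a minimal Python sketch of it (the dictionaries simply transcribe the matrix above; criteria rated 0 are omitted from ToodleDo’s ratings, and ToDo, as the reference, scores 0 by construction):

```python
# Weights from the pairwise comparison.
weights = {"Fast": 0.06, "Easy": 0.09, "Repeats": 0.13, "Priorities": 0.08,
           "Checkoff": 0.03, "Backup": 0.10, "Groups": 0.10, "Sorting": 0.13,
           "Hotlist": 0.09, "Restart": 0.04, "Recovery": 0.01,
           "Cond. Deadlines": 0.08, "Links": 0.06}

# ToodleDo's ratings relative to the ToDo reference, on the -2..+2 scale.
toodledo_ratings = {"Easy": -1, "Backup": -1, "Sorting": 1, "Hotlist": 1,
                    "Cond. Deadlines": 1}

# Weighted score: sum of (weight x rating) over all criteria.
score = sum(w * toodledo_ratings.get(c, 0) for c, w in weights.items())
print(round(score, 2))  # 0.11
```

Because the two negative ratings fall on lower-weighted criteria than the three positive ones, the total comes out slightly positive.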

And that jibes nicely with my intuition.  I got ToDo first, and enjoyed it.  But ever since I got ToodleDo, I’ve preferred it.  Every once in a while, I switch back to ToDo, but it never lasts very long.  And up until I did this decision matrix, all I had was a vague intuition that ToodleDo was better for me; now, I actually have an explanation.

But there’s a problem.  ToDo handles repeating events internally; that is, when I check off the current instance of a repeating event, ToDo immediately creates the next one in the series.  ToodleDo, on the other hand, generates subsequent repeating events only when you sync the app with the ToodleDo website.

This is a problem for me when I travel.  I was in Berlin recently, for a conference.  And I don’t have a data plan for my iPhone (that’s a whole separate story), so I couldn’t sync either app.  But that means ToodleDo  couldn’t roll repeating items over properly.  So before I went to Berlin, I sync’d up ToDo and used it while I was gone.  When I came back, though, I switched back to ToodleDo.  When I go to Sweden at the end of March, I’ll be using ToDo again.

Does the evaluation consider that?  No it doesn’t, because I didn’t.  The evaluation is only as good as the evaluator.  When I evaluated the two apps, I was nestled snugly at home, WiFi at the ready – and sync’ing either ToDo or ToodleDo is a non-issue.  If I’d’ve done the evaluation in Berlin, I’m sure I’d’ve gotten different numbers, because the repeating events problem would have been right there in my face.

So this underscores a limit with the evaluation method – indeed, a limit with any method: it’s only as good as the situation you’re in when you use it.  Some people might say a method is only as good as the information you use, but it’s more than that.  My situation, in this case, includes me, my goals (at the time), my experiences, all the information I have handy, constraints, and anything else that could possibly influence my decisions at the time.

The problem, then, is that a method depends on the situation when it’s used.  But that situation may be different for the person doing the evaluation than for the person(s) who will have to live with the decision being made.  Indeed, it’s virtually guaranteed that the situations will be different, if for no other reason than the implications of a decision will only occur later.

Does this put the kibosh on these kinds of methods?

Not at all.  It just means that we must be vigilant and diligent in their application.  If I had done the evaluation in Berlin, ToDo would have won, because in that situation, ToodleDo would have scored poorly on repeating events.  This is as it should be.  That means that in the two different situations, the method worked.  The problem is that in any one given situation, there’s no way to take into account any other situations.

Happily, there is fruitful and vigorous research concerned exactly with this.  Some people call it situated cognition; others call it situated reasoning.  We’ve not yet figured out how to treat situations reliably, but I think it’s only a matter of time before we do.

In the meantime, there is at least one other possible way to treat other situations.  A popular technique to help set up a design problem is the use case (or what I call a usage scenario).  These are either textual or visual descriptions of the interactions involved in using the thing you’ll design.  They can be quite complex and detailed.  Usage scenarios try to capture a specific situation other than the one that includes the designers during the design process.  So it’s at least possible that usage scenarios could help designers evaluate designs and products better.

One final caveat: this evaluation is particular to me.  It is unlikely that anyone will agree completely with my evaluation, because their situations are different from mine.  So I’m not saying ToodleDo “is better” than ToDo.  I’m just saying it seems to be better for me.

As they say: your mileage may vary.