When I started my first job at Amazon.com, as the first analyst in the strategic planning department, I inherited the work of producing the Analytics Package. I capitalize the term both because it was a serious tool for making our business legible and because the job of producing it each month ruled my life for over a year.
Back in 1997, analytics wasn't even a real word. I know because I tried to look up the term, hoping to clarify just what I was meant to be doing, and I couldn't find it, not in the dictionary, not on the internet. You can age yourself by the volume of search results the average search engine returned when you first began using the internet in force. I remember when pockets of wisdom were hidden in eclectic newsgroups, when Yahoo organized a directory of the web by hand, and later when many Google searches returned very little, if not nothing. Back then, if Russians wanted to hack an election, they might have planted some stories somewhere in rec.arts.comics and radicalized a few nerds, but that's about it.
Though I couldn't find a definition of the word, it wasn't difficult to guess what it was. Some noun form of analysis. More than that, the Analytics Package itself was self-describing. Literally. It came with a single page cover letter, always with a short preamble describing its purpose, and then jumped into a textual summary of the information within, almost like a research paper abstract, or a Letter to Shareholders. I like to think Jeff Bezos' famous company policy, instituted many years later, banning PowerPoint in favor of written essays, had some origins in the Analytics Package cover letter way back when. The animating idea was the same: if you can't explain something in writing to another human, do you really understand it yourself?
My interview loop at Amazon ended with an hour with the head of recruiting at the time, Ryan Sawyer. After having gone through a gauntlet of interviews that included almost all the senior executives, among them Jeff Bezos and Joy Covey, some of the most brilliant people I've ever met in my life, I thought perhaps the requisite HR interview would be a letup. But then Ryan asked me to explain the most complex thing I understood in a way he'd understand. It would be good preparation for my job.
What was within the Analytics Package that required a written explanation? Graphs. Page after page of graphs, on every aspect of Amazon's business. Revenue. Editorial. Marketing. Operations. Customer Service. Headcount. G&A. Customer sentiment. Market penetration. Lifetime value of a customer. Inventory turns. Usually four graphs to a page, laid out landscape.
The word Package might seem redundant if Analytics is itself a noun. But if you saw one of these, you knew why it was called a Package. When I started at Amazon in 1997, the Analytics Package was maybe thirty to forty pages of graphs. When I moved over to product management, over a year later, it was pushing a hundred pages, and I was working on a supplemental report on customer order trends in addition. Analytics might refer to a deliverable or the practice of analysis, but the Analytics Package was like the phone book, or the Restoration Hardware catalog, in its heft.
This was back in the days before entire companies focused on building internal dashboards and analytical tools, so the Analytics Package was done with tools we'd consider today the equivalent of twigs and dirt in sophistication. I entered the data by hand into Excel tables, generated and laid out the charts in Excel, and printed the paper copies.
One of the worst parts of the whole endeavor was getting the page numbers in the entire package correct. Behind the Analytics Package was a whole folder of linked spreadsheets. Since different charts came from different workbooks, I had to print out an entire Analytics Package, get the ordering correct, then insert page numbers by hand in some obscure print settings menu. Needless to say, ensuring page breaks landed where you wanted them was like defusing a bomb.
Nowadays, companies hang flat screen TVs on the walls, all of them running 24/7 to display a variety of charts. Most everyone ignores them. The spirit is right, to be transparent all the time, but the understanding of human nature is not. We ignore things that are shown to us all the time. However, if once a month, a huge packet of charts dropped on your desk, with a cover letter summarizing the results, and if the CEO and your peers received the same package the same day, and that piece of work included charts on how your part of the business was running, you damn well paid attention, like any person turning to the index of a book on their company to see if they were mentioned. Ritual matters.
The package went to senior managers around the company. At first that was defined by your official level in the hierarchy, though, as most such things go, it became a source of monthly contention as to who to add to the distribution. One might suspect this went to my head, owning the distribution list, but in fact I only cared because I had to print and collate the physical copies every month.
I rarely use copy machines these days, but that year of my life I used them more than in all the days that came before and all the days still to come, and so I can say with some confidence that they are among the least reliable machines ever made by mankind.
It was a game, one whose only goal was to minimize pain. A hundred copies of a hundred page document. The machine will break down at some point. A sheet will jam somewhere. The ink cartridge will go dry. How many collated copies do you risk printing at once? Too few and you have to go through the setup process again. Too many and you risk a mid-job error, which then might cascade into a series of ever more complex tasks, like trying to collate just the pages still remaining and then merging them with the pages that were already completed. [If you wondered why I had to insert page numbers by hand, it wasn't just for ease of referencing particular graphs in discussion; it was also so I could figure out which pages were missing from which copies when the copy machine crapped out.]
You could try just resuming the task after clearing the paper jam, but in practice it never really worked. I learned that copy machine jams on jobs of this magnitude were, for all practical purposes, failures from which the machine could not recover.
I became a shaman to all the copy machines in our headquarters at the Columbia building. I knew which ones were capable of this heavy duty task, how reliable each one was. Each machine's reliability fluctuated through some elusive alchemy of time and usage and date of the last service visit. Since I generally worked late into every night, I'd save the mass copy tasks for the end of my day, when I had the run of all the building's copy machines.
Sometimes I could sense a paper jam coming just by the sound of a machine's internal rollers and gears. An unhealthy machine would wheeze, like a smoker, and sometimes I'd put my hands on a machine as it performed its service for me, like a healer laying hands on a sick patient. I would call myself a copy machine whisperer, but when I addressed them it was always a slew of expletives, never whispered. Late in my tenure as analyst, I got budget to hire a temp to help with the actual printing of the monthly Analytics Package, and we keep in touch to this day, bonded by having endured that Sisyphean labor.
My other source of grief was another tool of deep fragility: linked spreadsheets in Excel 97. I am, to this day, an advocate for Excel, the best tool in the Microsoft Office suite, and still, if you're doing serious work, the top spreadsheet on the planet. However, I'll never forget the nightmare of linked workbooks in Excel 97, an idea which sounded so promising in theory and worked so inconsistently in practice.
Why not just use one giant workbook? Various departments had to submit data for different graphs, and back then it was a complete mess to have multiple people work in the same Excel spreadsheet simultaneously. Figuring out whose changes stuck, that whole process of diffs, was untenable. So I created Excel workbooks for all the different departments. Some of the data I'd collect myself and enter by hand, while some departments had younger employees with the time and wherewithal to enter and maintain the data for their organization.
Even with that process, much could go wrong. While I tried to create guardrails to preserve formulas linking all the workbooks, everything from locked cells to bold and colorful formatting to indicate editable cells, no spreadsheet survives engagement with a casual user. Someone might insert a column here or a row there, or delete a formula by mistake. One month, a user might rename a sheet, or decide to add a summary column by quarter where none had existed before. Suddenly a slew of #ERROR's show up in cells all over the place, or if you're unlucky, the figures remain, but they're wrong and you don't realize it.
Thus some part of every month was going through each spreadsheet and fixing all the links and pointers, reconnecting charts that were searching for a table that was no longer there, or more insidiously, that were pointing to the wrong area of the right table.
Even after all that was done, though, sometimes the cells would not calculate correctly. This should have been deterministic. That's the whole idea of a spreadsheet, that the only error should be user error. A cell in my master workbook would point at a cell in another workbook. They should match in value. Yet, when I opened both workbooks up, one would display 1,345 while the other would display 1,298. The button to force a recalculation of every cell was F9. I'd press it repeatedly. Sometimes that would do it. Sometimes it wouldn't. Sometimes I'd try Ctrl - Alt - Shift - F9. Sometimes I'd pray.
One of the only times I cried at work was late one night, a short time after my mom had passed away from cancer, my left leg in a cast from an ACL/MCL rupture, when I could not understand why my workbooks weren't checking out, and I lost the will, for a moment, to wrestle them and the universe into submission. This wasn't a circular reference, which I knew could be fixed once I pursued it to the ends of the earth, or at least the bounds of the workbook. No, this inherent fragility in linked workbooks in Excel 97 was a random flaw in a godless program, and I felt I was likely the person in the entire universe most fated to suffer its arbitrary punishment.
I wanted to leave the office, but I was too tired to go far on my crutches. No one was around that section of the office at that hour. I turned off the computer, turned out the lights, put my head down on my desk for a while until the moment passed. Then I booted the PC back up, opened the two workbooks, and looked at the two cells in question. They still differed. I pressed F9. They matched.
Most months, after I had finished collating all the copies of the Analytics Package, clipping each with a small, then later medium, and finally a large binder clip, I'd deliver most copies by hand, dropping them on each recipient's empty desk late at night. It was a welcome break to get up from my desk and stroll through the offices, maybe stop to chat with whoever was burning the midnight oil. I felt like a paper boy on his route, and often we'd be up at the same hour.
For all the painful memories that cling to the Analytics Package, I consider it one of the formative experiences of my career. In producing it, I felt the entire organism of our business laid bare before me, its complexity and inner workings made legible. The same way I imagine programmers visualizing data moving through tables in three dimensional space, I could trace the entire ripple out from a customer's desire to purchase a book, how a dollar of cash flowed through the entire anatomy of our business. I knew the salary of every employee, and could sense the cost of their time from each order as the book worked its way from a distributor to our warehouse, from a shelf to a conveyor belt, into a box, then into a delivery truck. I could predict, like a blackjack player counting cards in the shoe, what % of customers from every hundred orders would reach out to us with an issue, and what % of those would be about what types of issues.
I knew, if we gained a customer one month, how many of their friends and family would become new customers the next month, through word of mouth. I knew if a hundred customers made their first order in January of 1998, what % of them would order again in February, and March, and so on, and what the average basket size of each order would be. As we grew, and as we gained some leverage, I could see the impact on our cash flow from negotiating longer payable days with publishers and distributors, and I'd see our gross margins inch upwards every time we negotiated better discounts off of list prices.
What comfort to live in the realm of frequent transactions and normal distributions, a realm where the law of large numbers was the rule of law. Observing the consistency and predictability of human purchases of books (and later CDs and DVDs) each month was like spotting some crystal structure in Nature under a microscope. I don't envy companies like Snapchat or Twitter or Pinterest, social networks that have gone public or likely will have to someday, trying to manage investor expectations when their businesses are so large and yet still so volatile, their revenue streams even more so. It is fun to grow with the exponential trajectory of a social network, but not fun if you're Twitter trying to explain every quarter why you missed numbers again, and less fun when you have to pretend to know what will happen to revenue one quarter out, let alone two or three.
At Amazon, I could see our revenue next quarter to within a few percentage points of accuracy, and beyond. The only decision was how much to tell Wall Street we anticipated our revenue being. Back then, we always underpromised on revenue; we knew we'd overdeliver, the only question was how much we should do so and still maintain a credible sense of surprise on the next earnings call.
The depth of our knowledge of our own business continues to exceed that of any company I've worked at since. Much of the credit goes to Jeff for demanding that level of detail. No one can set a standard for accountability like the person at the top. Much credit goes to Joy and my manager Keith for making the Analytics Package one of the strategic planning department's central tasks. That Keith pushed me into the arms of Tufte changed everything. And still more credit belongs to all the people who helped gather obscure bits of data from all parts of the business, from my colleagues in accounting to those in every department in the company, many of whom built their own models for their own areas, maintaining and iterating them with a regular cadence because they knew every month I'd come knocking and asking questions.
I'm convinced that because Joy knew every part of our business as well as or better than almost anyone running them, she was one of those rare CFOs that can play offense in addition to defense. Almost every other CFO I've met hews close to the stereotype: always reining in spending, urging more fiscal conservatism, casting a skeptical eye on any bold financial transactions. Joy could do that better than the next CFO, but when appropriate she would urge us to spend more with a zeal that matched Jeff's. She, like many visionary CEOs, knew that sometimes the best defense is offense, especially when it comes to internet markets, with their pockets of winner-take-all contests, first mover advantages, and network effects.
It still surprises me how many companies don't help their employees understand the numeric workings of their business. One goes through orientation and hears about culture, travel policies, where the supply cabinet is, maybe some discussion of mission statements. All valuable, of course. But when was the last time any orientation featured any graphs on the business? Is it that we don't trust the numeracy of our employees? Do we fear that level of radical transparency will overwhelm them? Or perhaps it's a mechanism of control, a sort of "don't worry your little mind about the numbers, just focus on your piece of the puzzle"?
Knowing the numbers isn't enough in and of itself, but as books like Moneyball make clear, doing so can reveal hidden truths, unknown vectors of value (for example, in the case of Billy Beane and the Oakland A's, on-base percentage). To this day, people still commonly talk about Amazon not being able to turn a profit for so many years as if it is some Ponzi scheme. Late one night in 1997, a few days after I had started, and about my third or fourth time reading the most recent edition of the Analytics Package cover to cover, I knew our hidden truth: all the naysaying about Amazon's profitless business model was a lie. Every dollar of our profit we didn't reinvest into the business, and every dollar we didn't raise from investors to add to that investment, would be just kneecapping ourselves. The only governor of our potential was the breadth of our ambition.
What does this have to do with line graphs? A month or two into my job, my manager sent me to a seminar that passed through Seattle. It was a full day course centered around the wisdom in one book, taught by the author. The book was The Visual Display of Quantitative Information, a cult bestseller on Amazon.com, the type of long tail book that, in the age before Amazon, might have remained some niche reference book, and the author was Edward Tufte. It's difficult to conjure, on demand, a full list of the most important books I've read, but this is one.
My manager sent me to the seminar so I could apply the principles of that book to the charts in the Analytics Package. My copy of the book sits on my shelf at home, and it's the book I recommend most to work colleagues.
In contrast to this post, which has buried the lede so far you may never find it, Tufte's book opens with a concise summary of its key principles.