The Feynmanization of Chapter 1

As I discussed in a previous post, I want to emulate the admirable clarity and accessibility of Feynman’s Lectures on Physics in my own attempt to write an introductory textbook on information metrics for statistical inference.  Below are my thoughts on how I can apply the lessons that I drew from Feynman in my previous post.

More to the point, I’ve rewritten Chapter 1, “What is Inference?”, based on these lessons.  So now I ask you: is this a genuine improvement?  Note that this is an intro chapter with only the simplest math (some addition and multiplication), so anyone should be able to understand it and critique it!  Please add comments to this post to tell me whether you think the specific changes I outline below improve the chapter, compared with the original version.  I am particularly interested in two things: whether you think the ideas in my plan are the right direction to pursue, and whether their actual “reduction to practice” in the new draft chapter works.  Above all, tell me how I need to improve my chapter and my writing!

How can I apply Feynman’s lessons to my own writing?

  • Short, self-contained chapters following Feynman’s question and answer model. I’ve struggled a bit over where to put some topics in my current “whole subject” chapters. I think these topics (e.g. Law of Large Numbers; basic info theory relations, etc.) will work better as self-contained chapters, which also make the precise order of topics less important. This seems like a useful goal. It will force me to relate each topic to my fundamental questions. It makes the material more accessible, like John Baez’s introductions in each This Week’s Finds. Reducing the stress on exact linear order will make it much easier to use this material in multiple formats, and to get feedback from other people. (I’ve never been able to get feedback on more than one chapter, from anyone). Finally, it’s probably an egomaniacal delusion to stuff a whole subject entitled “Inference” into one chapter. This layout pretends that each subject can be treated separately. A larger number of short chapters may actually work better: while each chapter is self-contained, it will discuss fundamental questions that connect it to many other chapters.

  • I can rework much of my material into dialog form. For example, when I introduce the Monty Hall problem, I say there are two points of view on Monty’s “twist”. One is “it doesn’t make any difference” (some people find the whole question odd…); the other is “it makes a difference” (some people make the argument that…). This cries out to become an actual dialog (with named characters, quotes, exclamation points!).

  • I definitely have a mild case of math-envy, and go out of my way to show that I can drop some equations where relevant. The problem is, this turns into a conventional authority / soporific effect. I’m not sure how to strike the right balance. I guess you just ask yourself what’s truly needed for learning the material best. For example, my little section on rearrangement rules seems more “optional” than “needed”.
  • I need to define for my reader a “science of information” that is not just math. The subject matter (abstract information; statistical inference) implies that it is pure math, but I’m not sure that’s right (especially for me!). The whole point of my approach is to redefine information as empirical. This is not just a methodological detail. First and foremost, it’s my personality. I’m a scientist in outlook, not a mathematician. I can develop these ideas scientifically (i.e. “thought experiments” or computational experiments, or using math as a model just as people do in physics or any other science). It would be better for me to focus this text on my strengths, taking full advantage of the scientific approach that I can contribute, rather than allowing this to become a watered-down math textbook. What are the elements of my “science of information”?
    • the key principle of operational definitions

    • the fundamental distinction between observable vs. hidden states as the foundation for all inference.

    • correspondingly, defining all information as empirical, i.e. measuring our ability to predict observables.
    • consequently, the focus on being able to compute: first, numerical models of actual data; second, measurements of information yield.

  • An essential corollary is that I must articulate the fundamental questions of this new science. Each chapter introduction will raise these questions in dialog form, showing that our thinking about basic things is confused and incomplete. This will be a huge improvement over my existing introductions, which tend to be pompous, empty statements of grand intentions. Actually, it would do me a lot of good to have clear statements of these fundamental questions. It will probably turn out that I am just as confused as everyone else, even about how to articulate what the key questions are. Let’s try a few obvious questions:

    • Is there a quantitative, scientific theory of information? Can “information” be measured in a way that solves problems generally?
    • How can we establish a clear framework of operational definitions for the phenomena that matter in this field? E.g. “observable” vs. “hidden”; “prediction” in the absence of an “objective observer” etc.
    • What “counts” as information? This has a number of confusing aspects (randomness vs. uniformity; “hidden” information; mutual information; definition of “prediction”), and is the question that empirical information answers.
    • How can we do experiment planning (before we know the result!)?
    • What are the requirements for sustainable (unbounded) information production? What is the difference between biological evolution and, say, clouds forming in the sky (or geological formations underground)? What are the requirements for scalable information production, i.e. a powertool?

    • Can this be made a simple, general procedure for discovering any information structure, or solving any problem?
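
As a small computational experiment in the spirit of the list above, the two points of view on Monty’s “twist” can be tested directly. The sketch below is a minimal simulation, assuming the standard rules: three doors, Monty knows where the car is, he always opens a goat door other than the player’s pick, and he always offers the switch. The names play and win_rate are illustrative only and are not taken from the chapter.

```python
import random


def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # hidden state: where the car is
    pick = rng.choice(doors)   # the player's first choice
    # Monty opens a door that is neither the player's pick nor the car
    # (assumed rule: he never reveals the car).
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the fraction of rounds won with a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumptions, staying wins about one round in three and switching wins about two in three, so the twist does make a difference. Changing the assumed rule for how Monty picks a door changes the answer, which is itself a useful topic for the dialog.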

4 Responses to “The Feynmanization of Chapter 1”

  1. John Baez Says:

    Hi! I got your email - I’ll check out this new draft and make some comments in a while. I’ll have to add this blog to my list of places to visit.

    (Lisa and I have been in Paris for most of the summer, but will return to California tomorrow.)

    Don’t succumb to math envy! Or if you do, imitate those mathematicians who explain things using words rather than equations. :-)

  2. John Baez Says:

    Hello again. I just took a look at your new draft. It’s vastly more appealing than the old one!

    I like the dialogues. You can have a lot of fun with them if you have the same characters appear repeatedly, with their own distinctive personalities - illustrating different cognitive tendencies, or different aspects of your own thinking. Done right this sort of thing is very engaging, since we’re hard-wired to be fascinated by real people and their conflicts.

    It’s a real art to bring a taste of “real life” into such dialogues without distracting attention from the points you’re trying to make. Ideally, the characters’ personalities can be very helpful in presenting different points. For example, it makes sense to have a mathematician named Matt (cute) who is good at rigorous reasoning but a bit naive when it comes to epistemology: he can help out when math is needed, but also serve to illustrate the dangers of certain kinds of naivete. You could have a frequentist and a Bayesian to carry out two sides of that battle… after all, I think part of your book is about the limitations of “subjective Bayesianism”. Sonya shouldn’t be just “a scientist”, but some specific kind of scientist: you may want a number of different kinds: biologists, computer scientists, physicists and so on — each with their own strengths and weaknesses. Since you’re a Renaissance man, you could use quite a cast of characters.

    You write: “Most of us have learned about mathematical logic, theorems and proofs in math class.” Ah, if only that were true! :-) It depends who your audience is, I guess.

    Using the Monty Hall paradox to introduce Bayes’ law is good, but maybe you should start with an easier example, like: “if the probability that you die of cancer given that you smoke is 20%, and the probability that you smoke is 10%, and… then… ?”. In other words, a simple “plug and chug” example, just to illustrate the formula and nail down the concepts in the simplest possible way.

    But, you can probably find a real-world example that seems quite striking! People tend to get mixed up between “the probability that P, given Q” and “the probability that Q, given P”, and get shocked when these are drastically different. Getting over this seems to be crucial to getting good at statistical reasoning.

  3. Robert Cadena Says:

    The link for the new chapter returns a “404 — File not found.” This is the url the link points to: http://thinking.bioinformatics.ucla.edu/files/2008/08/chapter1.pdf

  4. leec Says:

    Thanks, Robert. The blog was recently moved to a new server and the file attachments don’t seem to have come along for the ride. I have relinked to another copy of the chapter stored on another server. Hope you enjoy it!

    – Chris Lee
