Archive for June, 2008

Correlative Analytics

Once again, Kevin Kelly explains the intersection of computer science, mathematics, large datasets, and science in a way that few can. The link will take you to the entire post, but these juicy tidbits are here to tease:

There’s a dawning sense that extremely large databases of information, starting in the petabyte level, could change how we learn things. The traditional way of doing science entails constructing a hypothesis to match observed data or to solicit new data. Here’s a bunch of observations; what theory explains the data sufficiently so that we can predict the next observation?…

In a cover article in Wired this month Chris Anderson explores the idea that perhaps you could do science without having theories.

This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.

Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

There may be something to this observation. Many sciences such as astronomy, physics, genomics, linguistics, and geology are generating extremely huge datasets and constant streams of data in the petabyte level today. They’ll be in the exabyte level in a decade. Using old fashioned “machine learning,” computers can extract patterns in this ocean of data that no human could ever possibly detect. These patterns are correlations. They may or may not be causative, but we can learn new things. Therefore they accomplish what science does, although not in the traditional manner…

My guess is that this emerging method will be one additional tool in the evolution of the scientific method. It will not replace any current methods (sorry, no end of science!) but will complement established theory-driven science. Let’s call this data-intensive approach to problem solving Correlative Analytics…

Perhaps understanding and answers are overrated. “The problem with computers,” Pablo Picasso is rumored to have said, “is that they only give you answers.”  These huge data-driven correlative systems will give us lots of answers — good answers — but that is all they will give us. That’s what the OneComputer does —  gives us good answers. In the coming world of cloud computing perfectly good answers will become a commodity. The real value of the rest of science then becomes asking good questions…

This is the clearest expression yet of what I think the Discovery Informatics degree at my school can offer to those interested in these emerging fields. And remember, where science leads, business opportunities follow closely behind. There is much to be done…………….

Swear To God

Friends, it is 10:55 PM as I write this. I am studying Spanish, specifically the third person direct object pronoun, and the various rules attendant. One of my homework questions (bear with me):

Using a mix of males and females, think of four well-known people that you either admire, detest, hate, or respect. Jot down their names, and then write how you feel about that person using the following verbs: admirar, detestar, odiar, respetar.

Modelos (example)

Barack Obama: Lo admiro porque es inteligente.

Paris Hilton: La detesto porque es tonta y egoísta.

Did you get that? My Spanish textbook just used Barack Obama as an example. Today is June 22, 2008. This book, in its 3rd edition, was published in 2008.

Can I assume the authors of a Spanish textbook thought enough of Obama to use him as an example as early as 2007?


The Nuclear Familia in Spanish

French, when I was taught it lo those many years ago, had a straightforward vocabulary that described the family unit. The list of words was short, matched my American concepts, and was learned without any great strain.

Modern Spanish seems to reflect the new reality, and it is interesting to see how my book handles the changes.

A partial list of the English equivalents:





step-brother and sister

half-brother and sister

step-father and mother

single father and mother


Seems like my new language has kept up with the times……..

Mi Clase de Español

Posting has been light, likely because I am trying to memorize vocabulary, verbs, and grammar, as well as learn the pronunciation of words that often look just like English but sound so very different. Given that I started Spanish with no previous class time, I am somewhat surprised by how much I have picked up in just two weeks of class. Whoever said that learning a language was best done by immersion was right, mostly. We spend one hour and 45 minutes in class every day, and I put in lots of study daily on top of that. It helps that our professor, who normally teaches much higher level Spanish courses, converses with each of us every day. He is building our ability to hear more complex Spanish, understand it, and respond accordingly. 25% of our grade will come from an oral evaluation at the end of the term. Even with that daily practice, the quizzes and tests that include an oral portion, where we must answer questions based on what he says, are still very tough for me.

Let me just note for you Spanish speakers that I am finding el pretérito confusing, and I have developed a strong dislike for irregular verbs. Not until today did I figure out that a verb could be irregular if only one conjugation, in one tense, differed from the standard pattern or spelling! Now that I have that clear, learning the spelling patterns that produce irregular conjugations will be much easier.

Also, unlike the years of elementary school and high school French that I took, today’s curriculum introduces us to cases and grammar in a much different way. Now, it’s all about learning the rules for specific cases, not memorizing tables and applying them to lists of verbs. I’m not saying it’s more effective, but it does seem to work.

But, bottom line, I’m thankful to be in classes this summer that rely on memorization, hard work, and perseverance. During this mathematical sabbatical, I am rebuilding the confidence that I had begun to lose in the Spring. Abstraction distraction (if I may) was doing me in. This work, at some level, gives me the confidence to believe that I can handle Biology. If I can memorize, and work hard, I will do it.

Don’t think we can make the same claim about that third calculus class, though.

Hasta luego!


Well, we’re three days into the introductory course for Spanish. I’m taking the advice of one of my advisers, who counseled that I get the language requirement out of the way as soon as possible; for him, therefore me, that meant starting this summer. Given that the school only offers three courses at the introductory level in summer classes, I had to choose between Russian, German, and Spanish. What would you have done in my shoes? I thought so…..

Gone are the quixotic notions of learning some slightly romantic, or relatively obscure, language, being able to converse (?!) with about 2 other people in a 100-mile radius, or, more likely, having to bust my hump to “learn” some language while I am trying to take yet another unbelievably difficult math or computer science course. No Farsi, or Hindi, or Chinese, or even (sob) Latin, which still holds some small fascination for me after a brief exposure 45 years ago. I guess time does heal old wounds…..

No, for me, it’s just Spanish. Which, after these 3 days, looks like it’s going to be plenty challenging. I’ve already memorized about 50 words, and it appears that our professor expects the pace of memorization to pick up as we go along. It’s nose to the grindstone, again, and I’m okay with that.

At least for now, the challenge is not understanding abstract concepts and complex rules/principles. It’s just plain old hard memorization, and, frankly, I welcome the change.

This will be good practice for Biology in the Fall.

Hasta Pronto…..


Hola! Me llamo el estudiante!

“Life’s hard, son. It’s harder when you’re stupid.” — The Duke.

Education is a companion which no misfortune can depress, no crime can destroy, no enemy can alienate, no despotism can enslave. At home, a friend, abroad, an introduction, in solitude a solace and in society an ornament. It chastens vice, it guides virtue, it gives at once grace and government to genius. Without it, what is man? A splendid slave, a reasoning savage. - Joseph Addison
The term informavore (also spelled informivore) characterizes an organism that consumes information. It is meant to be a description of human behavior in modern information society, in comparison to omnivore, as a description of humans consuming food. George A. Miller coined the term in 1983 as an analogy to how organisms survive by consuming negative entropy (as suggested by Erwin Schrödinger). Miller states, "Just as the body survives by ingesting negative entropy, so the mind survives by ingesting information. In a very general sense, all higher organisms are informavores." - Wikipedia
