The Ishmael Gradsdovic Papers, part forty-seven


Date: Fri, 1 Jul 94 16:14:11 PST
From: Ishmael Gradsdovic
To: boss
Subject: Knowledge Discovery in Databases
Message-Id: <9407012314.AA26787@slo.ludtech.com>

One thing that struck me about the systems I read about was that none of them were genuinely unsupervised in their learning. Each required someone with knowledge of the data to prune, present, explain, or modify the database in some way before the discovery system could make any sense of it.

My first bicycle was a green hand-me-down from a cousin in the valley, a "Go Navy" bumpersticker on the chain guard, rattling over eucalyptus leaves, my father in flannel shirt and blue jeans and dark hair and a beard works on the Coleman, the smell of the ocean and Vicks vapor rub. Squirrels so tame that if I hold very still they'll eat peanuts out of my hands.

I can't get to sleep, there are red and orange blobs floating in the darkness of my room, my nose is blocked but I smell the vapors rising from the hot skin of my chest and hear the vaporizer boiling and hissing, and I turn over the pillow again to put my head down on the cold side.

In some systems, the collection of possible data fields has to be trimmed in advance to the most promising set based on human-supplied heuristics. In others, worthless correlations (100% of mothers are women) must be identified and discarded. In others, continuous values must be mapped onto discrete values or vice versa.

Los Osos means "the bears," my dad says, and Montana De Oro "mountain of gold," I take the golden mountain on faith but ask, "Dad, are there bears here?" The big green canvas tent next to the station wagon, the spiral shells in the swampy meander herons flopping off trailing dangly legs smooth sand perforated by retreating crabs ducking into sand cliff caves bubbling clicking mouths.

My room is a lava lamp of red and orange blobs moving around to the sound of percolation, the pillow is hot again my hair is wet. Close my eyes and the blobs float in the void. I can't see over the counter but I want a cheeseburger and on my tip-toes I can reach the straw dispenser. An electric shock shoots through my body, I wake up screaming and pointing at the ceiling, "Mommy, the spiders are going to eat me."

In no case is the system able to take a simple initial directive (e.g. find other fields in the database which alone or in combination correlate positively with field X) and with no other assistance find important, interesting, and expressible new information.
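
A minimal sketch in Python of what that directive looks like when taken literally, for single fields only. The table, its field names, and the 0.5 threshold are hypothetical, invented for illustration and not taken from any particular discovery system.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant field carries no correlation signal
    return cov / sqrt(var_x * var_y)

def positive_correlates(table, target, threshold=0.5):
    """Return (field, r) pairs with r >= threshold, strongest first."""
    hits = []
    for field, values in table.items():
        if field == target:
            continue
        r = pearson(values, table[target])
        if r >= threshold:
            hits.append((field, r))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy table: each key is a field, each list is a column.
table = {
    "is_mother": [1, 1, 0, 0, 1, 0],
    "is_woman":  [1, 1, 1, 0, 1, 0],
    "age":       [34, 41, 29, 52, 38, 19],
    "shoe_size": [7, 8, 7, 11, 6, 10],
}

results = positive_correlates(table, "is_mother")
for field, r in results:
    print(f"{field}: r = {r:.2f}")
```

On this toy table the only field that clears the threshold is is_woman, which is exactly the kind of worthless correlation a person would have had to discard beforehand. The search does what it was told and still finds nothing worth knowing.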

There's a baby bird chirping at the base of a tree; it's fallen out of the nest. I pick it up and take it to the car. My dad is packing up. I put the bird in the back of the car and watch it stumble around. We can keep it, right Daddy? Just until it grows up? I lean over the stove and tent and lantern and food box and plead to my dad in the front seat until he relents and then I jump out of the car and topple the Coleman stove crushing the chick and killing it instantly.

"Mommy, don't GO" the pillow's hot she's just going to get a wet washcloth for my forehead it's too dark to see the spiders through the blobs but I know they're there. She puts the washcloth on my forehead and eyes and everything gets cooler, even the blobs are less warm, and I'm so tired.

The trick is not just to come up with information, but to come up with new information that a) can be put into words, b) does not duplicate similar discoveries, c) does not just state the obvious, d) does not make distinctions which mask other correlations or which rely on insufficient data, and e) can be used to make predictions about future data.

"Fly!" and for the first half of the parabola there almost was cooperation from nature sufficiently supernatural. And Daddy's blue nylon jacket with the cold cold zipper and tired tears like the ocean and no bears.


