Old saying: a scientist would rather use someone else’s toothbrush than another scientist’s nomenclature. This is true in most professions: we all have our own terminology, so tags inherently have a translation barrier. One person’s filing system may work for them but not for others.

I can hear the retorts now: Co-occurrence analysis! Topic clustering! This gets back to traditional IR again, and we really haven’t made much progress in making search results fuzzier. These techniques do not provide the same affordances as ontologies, schemas and mappings, or even duck typing if you want to look at it through the lens of an object type system.

If you look at most of Google’s architecture, tools and systems from a high enough level, you realize that it is all designed around the concerns of latency (and expected failure, but this post is about latency). Latency in networks, in disk reads and writes, from main memory to CPU, you name it. A handy chart that keeps me awake at night:

Relative Data Access Latencies, Fastest to Slowest
CPU Registers (1)
L1 Cache (1-2)
L2 Cache (6-10)
Main memory (25-50)
---- don’t cross this line, don’t go off the motherboard! ----
Hard drive (1e7)
LAN (1e7-1e8)
Floppy, CD-ROM (1e9)
WAN (1e9-2e9)
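
To put the gap across that line in numbers, here is a rough Python sketch that computes how much costlier each off-board tier is than main memory. The figures are copied straight from the chart and treated as unitless relative costs; the `LATENCY` table and the `slowdown` helper are hypothetical names made up for this illustration, not part of any real tool.

```python
# Relative access costs copied from the chart above, as (low, high) bounds.
# The values are unitless; only the ratios between tiers are meaningful here.
LATENCY = {
    "CPU registers": (1, 1),
    "L1 cache": (1, 2),
    "L2 cache": (6, 10),
    "Main memory": (25, 50),
    "Hard drive": (1e7, 1e7),
    "LAN": (1e7, 1e8),
    "Floppy, CD-ROM": (1e9, 1e9),
    "WAN": (1e9, 2e9),
}


def slowdown(slow, fast):
    """Return the (min, max) factor by which `slow` is costlier than `fast`."""
    slow_lo, slow_hi = LATENCY[slow]
    fast_lo, fast_hi = LATENCY[fast]
    # Best case: cheapest slow access vs. most expensive fast access.
    # Worst case: most expensive slow access vs. cheapest fast access.
    return slow_lo / fast_hi, slow_hi / fast_lo


if __name__ == "__main__":
    for tier in ("Hard drive", "LAN", "WAN"):
        lo, hi = slowdown(tier, "Main memory")
        print(f"{tier}: {lo:,.0f}x to {hi:,.0f}x costlier than main memory")
```

By these numbers, a single hard drive access costs as much as a few hundred thousand main memory accesses, which is why the line in the chart matters so much.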