Joho the Blog » everything_is

August 19, 2009

Dilbert goes miscellaneous

Amusing Dilbert today, for those who can’t resist a good taxonomy joke. (Thanks for the tip, Helena!)

[Tags: everything_is_miscellaneous comics dilbert humor taxonomy ]

1 Comment »

August 14, 2009

I know I’m not the only one who’s finding WolframAlpha sometimes frustrating because I can’t figure out the magic words to use to invoke the genii. To give just one example, I can’t figure out how to see the frequency of the surnames Kumar and Weinberger compared side-by-side in WolframAlpha’s signature fashion. It’s a small thing because “surname Kumar” and “surname Weinberger” will get you info about each individually. But over and over, I fail to guess the way WolframAlpha wants me to phrase the question.

Search engines are easier because they have already trained us how to talk to them. We know that we generally get the same results whether or not we use question marks and stop words such as “when” and “the.” We eventually learn that quoting a phrase searches for exactly that phrase. We may even learn that in many engines, putting a dash in front of a word excludes pages containing it from the results, or that we can do marvelous and magical things with prefaces that end in a colon (site:, define:). We also learn the semantics of searching: If you want to find out the name of that guy who’s Ishmael’s friend in Moby-Dick, you’ll do best to include some words likely to be on the same page, so “What was the name of that guy in Moby-Dick who was the hero’s friend?” is way worse than “Moby-Dick harpoonist.” I have no idea what the curve of query sophistication looks like, but most of us have been trained to one degree or another by the search engines who are our masters and our betters.

In short, we’re being taught a pidgin language: a simplified language for communicating across cultures. In this case, the two cultures are human and computer. I only wish the pidgin were more uniform and useful. Google has enough dominance in the market that its syntax influences other search engines. Good! But we could use some help taking the next step, formulating more complex natural language queries in a pidgin that crosses application boundaries, and that isn’t designed for standard database queries.

Or does this already exist?

[Tags: search pidgin nlp natural_language_processing google everything_is_miscellaneous ]

Categories: Uncategorized Tagged with: everythingIsMiscellaneous • everything_is_miscellaneous • google • metadata • natural_language_processing • nlp • pidgin • search Date: August 14th, 2009 dw

4 Comments »

August 11, 2009

The universality of names

There’s a terrific article by Carol Kaesuk Yoon in the NY Times about research that shows that humans around the world tend to cluster the natural world in highly similar ways, even using similar-ish names.

[Tags: everything_is_miscellaneous taxonomy ]

Categories: Uncategorized Tagged with: everythingIsMiscellaneous • everything_is_miscellaneous • folksonomy • taxonomy Date: August 11th, 2009 dw

1 Comment »

August 9, 2009

Twitterelevancy

With its new Fresh view, Delicious builds on the TweetNews idea of using links in Tweets (and other measures) as a way to find what’s newest and most interesting. As the blog post about it says:

Underneath the hood, Fresh factors several features into the ranking like related bookmark and tweet counts, “eats our own dogfood” by leveraging BOSS to filter for high quality results, as well as stitches tweets to related articles even if the tweets do not provide matching URLs (as ~81% of tweets do not contain URLs). Try clicking the ‘x Related Tweets’ link for any given story to see the Twitter conversation appear instantly inline.

It’s a welcome reslicing, not a whole new beast, but it seems useful.

[Tags: delivious everything_is_miscellaneous twitter news ]

Categories: Uncategorized Tagged with: delivious • everythingIsMiscellaneous • everything_is_miscellaneous • metadata • news • social networks • tagging • twitter Date: August 9th, 2009 dw

1 Comment »

August 7, 2009

Tags again

Jeez, it would save me a lot of time if Keynote (or PowerPoint, if you insist) let me tag slides and objects in slides (especially images). I spend way too much time looking for that slide of a “smart room” or the one that shows business vs. end-user use of Web 2.0, or that photo of an old broadcast tower. (Later that day: Maybe I should add, having just rewritten the Wikipedia entry on Interleaf, that back in the early 1990s, Interleaf gave us exactly that capability.)

Instead, I have two hacks, both a pain in the butt. First, I keep a humungous file of slides I think I’ll want to use again. Second, I’ve started putting tags into the speaker notes by putting the tags in brackets. But I use the speaker notes to speak from, so larding them up with tags is sub-optimal.

If Keynote did support tags, then, especially if you save Keynote files in the pre-2009 multi-file formats, it’d be a snap for third parties to build tools that extract the tags and manage them. (I have a fussy home-made utility that extracts the text from the speaker notes and builds an editable file of them. If you want it, let me know.)
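For anyone using the bracket hack in the meantime, here is a rough Python sketch of how bracketed tags could be pulled out of speaker notes and turned into a tag-to-slide index. It is not the home-made utility mentioned above, and it does not read Keynote files. It assumes the notes have already been exported to plain text, one string per slide, and that tags inside a pair of brackets are separated by commas or spaces. The function names (`extract_tags`, `strip_tags`, `build_index`) and the sample notes are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed convention (not a Keynote feature): tags sit in square brackets
# inside the speaker notes, e.g. "[smart_room, photo]", with commas or
# whitespace separating the tags within one bracket pair.
TAG_GROUP = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(notes):
    """Return the lower-cased tags in one slide's notes, in order, without duplicates."""
    tags = []
    for group in TAG_GROUP.findall(notes):
        for tag in re.split(r"[,\s]+", group.strip()):
            tag = tag.lower()
            if tag and tag not in tags:
                tags.append(tag)
    return tags


def strip_tags(notes):
    """Return the notes with the bracketed tag groups removed, for speaking from."""
    without_tags = TAG_GROUP.sub("", notes)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


def build_index(notes_by_slide):
    """Map each tag to the ascending list of slide numbers whose notes carry it."""
    index = defaultdict(list)
    for slide_number, notes in sorted(notes_by_slide.items()):
        for tag in extract_tags(notes):
            index[tag].append(slide_number)
    return dict(index)


if __name__ == "__main__":
    # Hypothetical exported notes, keyed by slide number.
    sample = {
        1: "Open with the broadcast era. [photo, broadcast_tower]",
        2: "Business vs. end-user use. [web2.0 chart]",
        3: "The room that knows you. [smart_room, photo]",
    }
    for tag, slides in sorted(build_index(sample).items()):
        print(tag, slides)
```

The `strip_tags` helper is there for the problem described above: it gives back a clean copy of the notes to speak from, so the tags can stay in the source file without cluttering the version you read on stage.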

Tags are easy! Tags are useful! Let tags be tags!

[Tags: tags everything_is_miscellaneous keynote powerpoint metadata whines ]

Categories: Uncategorized Tagged with: everythingIsMiscellaneous • everything_is_miscellaneous • keynote • metadata • powerpoint • tagging • tags • whines Date: August 7th, 2009 dw

2 Comments »

July 26, 2009

The Guardian on miscellaneous bookshelves

The Guardian has a fun article on schemes for arranging the books on your shelf, with an interesting set of comments. (It makes me want to send the entire thread a copy of Everything Is Miscellaneous.)

[Tags: everything_is_miscellaneous dewey the_guardian ]

Categories: Uncategorized Tagged with: dewey • everythingIsMiscellaneous • everything_is_miscellaneous • libraries • taxonomy • the_guardian Date: July 26th, 2009 dw

Be the first to comment »

July 25, 2009

AP to digitally monitor copyright

The AP has announced it is going to use an automated system to monitor the use of AP content on the Web, looking for copyright violations. The empire is fighting back. From the press release:

The Associated Press Board of Directors today directed The Associated Press to create a news registry that will tag and track all AP content online to assure compliance with terms of use. The system will register key identifying information about each piece of content that AP distributes as well as the terms of use of that content, and employ a built-in beacon to notify AP about how the content is used.

I think there are three possible broad-stroke outcomes:

1. The AP takes an enlightened and generous view of copyright protection and its terms of use, encouraging people to link to and cite its stories, and saving its angry face for commercial thieves, wholesale infringers, and other scum. The AP remains a major source of news, fulfills the social mission of the newspapers who are its members, and our culture is better off for it.

2. The AP’s automated system is set on a hair trigger. The AP protects its copyright so well that no one ever hears from it again.

3. The AP acts inconsistently. It sends scary letters to teenagers who copy three paragraphs about the Jonas Brothers and sics lawyers on a professor teaching a course on media studies. No one understands what the AP is doing, so we all get scared and hate it.

To start with, it’d be great if the AP’s copyright warnings didn’t just tell people what they can’t do, but also told them what they can do, and encouraged us to re-use the material as much as possible. On the other hand, since one of the aims of the new system (according to the press release) is to facilitate the use of pay walls, I expect we’ll see more of the AP’s content making itself irrelevant.

[Tags: ap media journalism free copyright copyleft everything_is_miscellaneous ]

Categories: misc Tagged with: ap • copyleft • copyright • everything_is_miscellaneous • free • journalism • media • misc Date: July 25th, 2009 dw

4 Comments »

July 22, 2009

My PDF talk on facts ‘n’ transparency

Link. (The video embeds my slides, but (1) they get more and more out of order in this YouTube video, although they were in the right order when I actually presented them, and (2) my font got lost somewhere in the translations, so there’s a fair bit of mis-sizing, text overflow, etc.) (I posted about one of the ideas in the talk, transparency as the new objectivity, here.)

[Tags: pdf09 transparency media politics e-democracy e-government e-gov everything_is_miscellaneous newspapers media ]

Categories: misc Tagged with: e-democracy • e-gov • e-government • everything_is_miscellaneous • media • misc • newspapers • pdf09 • politics • transparency Date: July 22nd, 2009 dw

1 Comment »

July 18, 2009

When there’s no such thing as the best

I posted my post about the Sotomayor hearings over at Huffington, where I got a grand total of two comments. The second one raised an interesting point. (The first one was funny.)

Or, “Senator, would you simply prefer that the Court be comprised of the best legal minds in the nation, regardless of their race, creed, or color, despite the fact that such a concept is foreign to the race conscious liberals among us?” – Parducci

That’s a reasonable response (leaving out everything after the “despite”), but I think it’s fundamentally wrong, since it assumes there is a way to rank order legal minds. There isn’t, because there is no such order.

Look at the current Justices. You may be able to say that one particular Justice’s “legal mind” is not as good as the rest (“Judge So-and-So just isn’t up to snuff”), but there isn’t any real way to rank them in order (except perhaps by how well their decisions accord with political sides). With heart surgeons, maybe you can look at the survival rates of their patients (and there are problems with that), but for judges, there aren’t criteria that result in a reliable, accurate, and agreed-upon quantitative ranking. Likewise, who would think there’s any sense in trying to numerically rank philosophers, historians, or chefs? You can see that a particular one isn’t in the top rank or is out of her league, but within that top rank, there isn’t a numeric ordering.

So, for nominees to the Supreme Court, the idea that we should take “the best legal minds” actually means that we should choose from among those who are highly qualified for the job. Since that class is far larger than nine, we get to choose our Justices based on many considerations, including the likely effect they’ll have on the political balance of the court and, yes, the likely effect they’ll have by bringing a diversity of experience and outlook. For the wisdom of a group is enhanced by including difference within it.

In fact, it would be interesting to see how the degree of qualification (based on whatever criteria one wants to suggest) going into the Court matches with the performance of the Justice over the course of her or his term.

[Tags: sotomayor diversity everything_is_miscellaneous philosophy ]

Categories: misc Tagged with: diversity • everything_is_miscellaneous • misc • philosophy • sotomayor Date: July 18th, 2009 dw

6 Comments »

July 11, 2009

Reslicing publications

The OCLC has an experimental site up that provides classification information for books and pubs. You type in the book’s title and author (or ISBN number, or other such ID), and it returns info about the various editions and how they’re classified in the OCLC’s Dewey Decimal Classification System or by the Library of Congress. You can then see the other books that share its Dewey Decimal number (for example, here’s Everything Is Miscellaneous, #303.4833>>Social sciences>>Social sciences, sociology & anthropology>>Social processes), at the OCLC’s useful Dewey Browser. Alas, when you click on the Library of Congress number, you get taken to a demand by the LC that you subscribe to Classification Web, instead of to the free LC Catalog (where my Misc book is listed like this).

Lots of metadata about the metadata…Gotta love it!

[Tags: everything_is_miscellaneous dewey_decimal oclc libraries books metadata ]

Categories: Uncategorized Tagged with: books • dewey_decimal • everythingIsMiscellaneous • everything_is_miscellaneous • libraries • metadata • oclc • taxonomy Date: July 11th, 2009 dw

4 Comments »