Joho the Blog » taxonomy

May 23, 2007

Salon’s “Miscellaneous” interview with me

Scott Rosenberg, co-founder of Salon and the author of Dreaming in Code, has posted at Salon an interview with me about Everything is Miscellaneous.

At his blog, Scott adds some “out-takes” from the interview, and recommends the book. [Tags: salon scott_rosenberg everything_is_miscellaneous folksonomy taxonomy tagging ]

Categories: Uncategorized Tagged with: digital culture • everythingIsMiscellaneous • media • philosophy • taxonomy Date: May 23rd, 2007 dw

5 Comments »

Technorati focuses on tags ‘n’ topics

(Disclosure: I’m on Technorati’s board of advisors. I saw an advance version of the changes, but otherwise had no direct influence. Also, although at some point I conceivably could make some indeterminate amount of money from Technorati, the fact that Dave Sifry is a friend influences my judgment more.)

Technorati has just done a major re-shaping of itself, which is interesting as a response to the increasing need for both pinpoint accuracy and broad context. Dave Sifry, the CEO, blogs about it here.

Technorati is driving down both roads simultaneously, which I think makes sense. On the one hand, if you want to do an old-fashioned text search through blogs, the site has improved its engine and pared down the experience. If, on the other hand, you want to see information in context (and on the Web that of course means being able to explore that context further), the site has taken several steps:

1. The default search now is for tags, not for text in blogs. Tags are expressions of what the readers think a post is about, so some types of searches should return more accurate, relevant and interesting results. Of course, we also use tags in idiosyncratic ways, so only experience will tell whether and when tag searching is more satisfying than text searching. In any case, Technorati lets you click to search through text, if that’s what you want. (You can go straight to the text search page via s.technorati.com.)

2. Technorati continues to include more sources and more types of information. In fact, the home page no longer positions Technorati as a blog search engine. “Include everything” is one of the key recommendations of Everything is Miscellaneous, so I like its continuing inclusiveness :)

3. These changes seem to move Technorati towards embracing topics as a basic unit of meaning. For example, if you search for “ron paul,” you are taken to a page that assembles blog posts, videos and photos about the controversial Republican. There are tabs for music and events as well, although in this case Technorati didn’t find any. There’s also a “WTF” post, an explanation of the topic generated and voted on by users. (It’s displaying the WTF by siegheilneocon, which only got 27 votes, instead of the one by beckychr007, which got 61 votes, seeming to prefer the most recent to the most popular, which is either a bug or I’m not understanding it.)

Topics are an important way to cluster ideas. At the moment, however, Technorati has no concept of a topic apart from a tag. The infrastructure to do more is in place, because the site already displays a list of related tags. The results pages don’t bring in the content from those tags, though. For example, if “john mccain” were a related tag, it might make sense to bring some of that tagged material into the “ron paul” topic page. That would give us a broader view of the topic. Conflating topics with tags can increase the precision of results (though not for highly ambiguous tags such as “shot”), but it can also reduce the context and thus our understanding. Granted, figuring out algorithmically what’s relevant and how it’s relevant is no small challenge. (Maybe if some topic pages were marked as especially worthwhile and stable, not all of the cleanup and construction would have to be done algorithmically.)

Likewise, at some point it’d be good to start relating topics, so that the system knows that “ron paul” is (in some sense) contained by “republicans” and republicans are related to “politics.” This sort of information can eventually be gleaned folksonomically from the tags. Of course there’d be nothing wrong with using existing taxonomies and ontologies to help further refine the relationships among topics. It’s always going to be a messy, overlapping, shifting mass of connections, but, well, so are we.
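To make the idea of gleaning containment from tags a little more concrete, here is a minimal Python sketch. It assumes a simple co-occurrence heuristic: one tag is treated as containing another when most posts carrying the narrower tag also carry the broader one, but not the other way around. The sample posts, the function name `containment_pairs`, and the 0.8 threshold are all made up for illustration; nothing here describes how Technorati actually works.

```python
from collections import Counter
from itertools import permutations

def containment_pairs(posts, threshold=0.8):
    """Return (broader, narrower) tag pairs inferred from co-occurrence.

    posts: iterable of sets of tags, one set per post (hypothetical data).
    `broad` is taken to contain `narrow` when at least `threshold` of the
    posts tagged `narrow` are also tagged `broad`, while fewer than
    `threshold` of the posts tagged `broad` also carry `narrow`.
    """
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in posts:
        tag_counts.update(tags)
        # Count every ordered pair of tags that appear on the same post.
        pair_counts.update(permutations(sorted(tags), 2))
    result = []
    for (broad, narrow), both in pair_counts.items():
        p_broad_given_narrow = both / tag_counts[narrow]
        p_narrow_given_broad = both / tag_counts[broad]
        if p_broad_given_narrow >= threshold and p_narrow_given_broad < threshold:
            result.append((broad, narrow))
    return sorted(result)

# Invented sample data: each set is the tags on one post.
posts = [
    {"ron paul", "republicans", "politics"},
    {"ron paul", "republicans"},
    {"john mccain", "republicans", "politics"},
    {"republicans", "politics"},
    {"politics", "democrats"},
]
pairs = containment_pairs(posts)
```

With this toy data, “republicans” comes out as containing “ron paul” but not the reverse. Tags that appear only once produce noisy pairs, so a real system would presumably want a minimum-count cutoff and far more data.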

This is not a criticism of what Technorati has done. In fact, I mean it as a way of expressing my excitement about where it goes from here. [Tags: technorati folksonomy tagging search blogs everything_is_miscellaneous]

I just heard about TagAndFacet, a tool that lets you tag Web sites, Outlook messages, and Windows files for easy re-finding. It also lets you declare “facets” — metadata categories of continuing use — so you can do faceted, tree-like browsing. A version is available for free with a limit on how many items you can tag; a for-pay version should be available soon. (I haven’t yet tried it.)

Categories: Uncategorized Tagged with: everythingIsMiscellaneous • media • taxonomy Date: May 23rd, 2007 dw

2 Comments »

May 22, 2007

The “Miscellaneous” Podcasts: Neil DeGrasse Tyson on the order of the universe

In the fourth installment of the Harvard-Wired “Miscellaneous” podcast series, I get to interview Neil DeGrasse Tyson, the astrophysicist and author of Death by Black Hole. We talk about our culture’s insistence on thinking there is one preferred way of ordering the cosmos. [Tags: neil_degrasse_tyson berkman astrophysics pluto planets taxonomy everything_is_miscellaneous]

Categories: Uncategorized Tagged with: taxonomy Date: May 22nd, 2007 dw

1 Comment »

May 14, 2007

Yahoo interview

Categories: Uncategorized Tagged with: business • everythingIsMiscellaneous • media • taxonomy Date: May 14th, 2007 dw

1 Comment »

May 8, 2007

Amazon’s tag feeds

Amazon is beefing up its RSS-ing of tags. (More at EverythingIsMiscellaneous.com)

Categories: Uncategorized Tagged with: everythingIsMiscellaneous • taxonomy Date: May 8th, 2007 dw

Be the first to comment »

April 27, 2007

Chris Lydon’s interview posted

Radio Open Source has posted the mp3 of yesterday’s show about everything being miscellaneous, with me, Karen Schneider, and Tim Spalding. Chris being Chris, he drives it more towards the broad and philosophical than, well, anyone else on radio. And best of all, you can hear me get the name of the author of Moby-Dick wrong! [Tags: everything_is_miscellaneous radio_open_source christopher_lydon karen_schneider tim_spalding media taxonomy folksonomy]

Categories: Uncategorized Tagged with: business • culture • digital culture • everythingIsMiscellaneous • media • philosophy • taxonomy Date: April 27th, 2007 dw

4 Comments »

April 3, 2007

Harvard Libraries Social Tagging Forum video is up

Harvard University Libraries held a workshop on social tagging and other such technologies last week. I blogged it here. Now the videos are up. Part I Part II. (I spoke in part I.) [Tags: tagging libraries folksonomy taxonomy everything_is_miscellaneous]

Categories: Uncategorized Tagged with: education • everythingIsMiscellaneous • taxonomy Date: April 3rd, 2007 dw

3 Comments »

March 28, 2007

Harvard Forum on Social Tagging

I’m at [well, I was yesterday when I wrote this] a session at Harvard’s Lamont Library (one of the 90+ libraries here) about Web 2.0 and social tagging. I just gave a 20-minute opener on why tagging matters.

Michael Hemment, the host, begins by showing tag clouds from 50 students who were asked to tag some particular resources. The group quickly guesses that the first tag cloud refers to the libraries, the next is Google, and the next is Jon Stewart. Very amusing.

Michael talks about why social tagging matters to libraries. He mentions some initiatives, including PennTags, Stanford IC, and the Steve Museum. Harvard has the CRT (Collaborative Research Tool) and EdTags initiatives. He also mentions iCommons (exploring iSites metadata and tagging) and ARTStor.

He takes a closer look at LibraryThing.com, showing how easy it is to enter titles, organize them, tag them, and get suggestions.

PennTags was created by the U Penn library to enable university members to tag books. (The site is open to anyone, but only U Penn members can add tags.) It begins with a tag cloud of tags used at least 58 times. Users can also create folders to organize bookmarks into projects. [I blogged about it here.]
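[A tag cloud with a minimum-use cutoff like that one is easy to sketch. The Python below is only an illustration: the sample tags, the `tag_cloud` function, and the font-size range are my assumptions, not PennTags’ actual code.]

```python
from collections import Counter

def tag_cloud(tag_uses, min_count=58, min_size=10, max_size=32):
    """Map each tag used at least `min_count` times to a font size.

    tag_uses: iterable of tag strings, one entry per use (hypothetical data).
    Sizes are scaled linearly between min_size and max_size.
    """
    counts = {t: n for t, n in Counter(tag_uses).items() if n >= min_count}
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low or 1  # avoid dividing by zero when all counts match
    return {
        tag: round(min_size + (n - low) * (max_size - min_size) / span)
        for tag, n in sorted(counts.items())
    }

# Invented usage log: "cooking" falls below the cutoff and is dropped.
uses = ["history"] * 120 + ["medieval"] * 58 + ["cooking"] * 5
cloud = tag_cloud(uses)
```

[Here the most-used tag gets the largest size, the tag right at the cutoff gets the smallest, and anything under the cutoff never appears in the cloud.]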

The Stanford Library Information Center combines tags, blogs and wikis. It includes tagging by librarians who organize resources in a somewhat more orderly way.

Harvard could, Michael says, enable tagging of the libraries’ resources, and the Lib-X tool (a browser add-in that gives you access to Harvard’s online resources) could be used to tag sites, adding to what Harvard knows.

Carla Lillvik, Research and Distance Services Librarian at the Gutman Library of the Harvard Graduate School of Education, looks at “social tagging and bibliographic management.” She says you want not only to find resources and organize them, but also to cite them.

She uses as her example the site Five Weeks to a Social Library. She adds it to her page at Connotea and tags it. She could also post it to EdTags.org. But what about a resource she finds in a research database, e.g. EBSCO Host? She could add it to Connotea, even though the URL doesn’t look persistent. But Connotea doesn’t pick up any of the bibliographic info from the database. (Connotea has agreements with a long list of such systems, including BioMed Central, PLoS, Nature.com, and arXiv.org, but not with all of them.) She can instead make a folder in EBSCO, which does indeed pick up all the info. [Sounds like we need a standard API for university e-research systems.] Harvard’s RefWorks has the advantage, Carla says, of enabling batch tagging [LibraryThing does too] and output in a variety of bibliographic styles [yay!]. RefWorks folders can be shared, even with people who don’t have an account; they can be shared as an RSS feed, too. (RefWorks works with Google Scholar: you can set a preference so that results can be imported into RefWorks.)

Michael Hemment presents Prof. Dan Smail’s Collaborative Research Tool (CRT), a social tagging tool that works within Harvard’s e-environment. In Smail’s course on Medieval Europe (History 1122), students are put onto teams (e.g., “France, Germany and Italy”) and are assigned sources. They create virtual note cards that are tagged, annotated and entered into a database. Class discussions, lectures, and final papers are based on these cards.

The cards tend to include the passage, comment, related links, and tags. It’s easy to navigate by tag.

Pedagogical implications, according to Michael: Students have to reflect on their tagging schemes. [meta learning] The cards “form the basis of complex intertextual discourse on a broad range of medieval topics.” E.g., you could see how Ulysses appears through multiple literatures. Also, tagging develops a personal relationship to the source material.

[Excellent. But we still need a way to write a document based on cards, so that adding info from the card automatically creates the right footnote and bibliographic entry in the document, and notes where the card has been used. I blogged about this here.]

Adam Seldow, a grad student at the Harvard School of Education, works on EdTags.org. It’s a social network to connect people who share interests in education. It’s open to anyone. You can tag a site, vote on bookmarks, email them, blog them, or find related blog postings. You can upload your papers, photos, presentations, etc.

Q: How does tagging fit with scholarly resource? Is there a way to cite where and how a resource is tagged?
A: (Michael) Not in the major tagging sites, e.g., del.icio.us. The lack of rules has been one of the advantages of these sites. The noise introduced can often be negated at least in part by the good rising to the top.

Q: How about privacy?
A: (Adam) EdTags lets you set the level of privacy. And it’s an actively managed site.

Q: What types of resources does EdTags tag?
A: (Adam) Mainly “gray literature” — blog posts, preprints, Web sites, course-generated papers.

Q: (me) What do we do about the fragmentation of the tagging space? I can tag in del.icio.us, Connotea, EdTags…
A: (Adam) A condition when we built EdTags was that it has to be able to talk with del.icio.us or export to an XML file. Personally, I use different tagging sites for different types of research.

Q: What are the patterns of use at EdTags?
A: EdTags has been live for a little over a year. (It started as TeacherShare.) First-year doctoral students, who were trained on it, use it. It’s being used in some specific courses and teacher education programs, plus a community of faculty members interested in emerging trends in education technology. The person who uploads the most bookmarks is a woman from Slovenia. There are about 400 users. About 100,000 hits/month.

Q: Did you build it from scratch?
A: It’s a mashup of Scuttle, an open source platform, with lots of custom work.

Q: HW and SW behind it? How did you finance it?
A: (Adam) A Harvard Provost Innovation Grant financed it.

Q: How to encourage the use of social tagging at a library?
A: (Michael) I don’t know that we want to encourage it. We’re exploring. [Tags: libraries tagging social_networks everything_is_miscellaneous folksonomy]

Categories: Uncategorized Tagged with: digital culture • education • everythingIsMiscellaneous • taxonomy Date: March 28th, 2007 dw

8 Comments »

March 20, 2007

Ranganathan’s fantasy

From Ranganathan, the founder of library science:

“Since multiplicity of helpful order among specific subjects is a fact independent of library classification – a fact to be reckoned with in arrangement – how are we to provide for it? It is a case of arranging concrete materials – books and other kindred materials – in such a way that one kind of arrangement presents itself to one person and another kind to another person. To secure this by pressing a button is obviously possible only in the world of fancy; it is not possible in the world of reality.”

Ranganathan, Philosophy of Library Classification (1951)

Via Tim Spalding via Jacob Glenn [Tags: ranganathan taxonomy everything_is_miscellaneous ]

Categories: Uncategorized Tagged with: everythingIsMiscellaneous • taxonomy Date: March 20th, 2007 dw

4 Comments »

March 13, 2007

[berkman] John Mayer: Legal education commons

John Mayer, the Exec Dir of the Center for Computer-Assisted Legal Instruction (CALI), is giving a Berkman Tuesday lunch talk called “Subclassing the Commons.” CALI is 25 yrs old, incorporated by Harvard and the U of Minnesota Law Schools. 204 US law schools and 23 international law schools are members. So are more than 100,000 law students. CALI makes lessons available online. This year, there will probably be a million lessons run. [John Palfrey has blogged the session here.]

He points out that there are sites that aggregate material put into the commons via Creative Commons licenses. But there’s not a lot there for law students. The commons by itself isn’t granular enough for communities of users, he says. People post on their blogs, “I’ve posted a paper at SSRN and would appreciate any comments,” or “I’m working on a project and was wondering if anyone else had,” or “Where can I find…?” John says, “If we aggregated all answers to those questions across all institutions, would that be a commons, and would it have amazing value?”

“We’re best known for our lessons,” he says. He shows a flow chart of a question. Law professors throw out a question, he says, knowing the ways the students will get it wrong. If one gets it right, the prof branches differently. It’s a “pruned tree.” CALI’s authors write questions as a tree. There are about 600 lessons. Their model is to get 5 profs to write 5 lessons (25 mins each) over 8 months; the profs are paid.
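[A lesson written as a tree of questions can be pictured with a small data structure. This Python sketch is my own illustration, assuming each anticipated answer simply points to the next question; the `Question` class, the `run_lesson` function, and the sample prompts are invented and are not CALI’s authoring format.]

```python
from dataclasses import dataclass, field

@dataclass
class Question:
    """One node in a hypothetical branching lesson."""
    prompt: str
    # Maps each anticipated answer to the next node; unknown answers end the walk.
    branches: dict = field(default_factory=dict)

def run_lesson(start, answers):
    """Walk the tree with a list of answers; return the prompts visited."""
    visited = []
    node = start
    for answer in answers:
        if node is None:
            break
        visited.append(node.prompt)
        node = node.branches.get(answer)
    if node is not None:
        # Record the question the student ended up on.
        visited.append(node.prompt)
    return visited

# Invented example: the wrong answer leads to review, the right one moves ahead.
remedial = Question("Review: what makes an offer binding?")
advanced = Question("Now apply the rule to a unilateral contract.")
root = Question(
    "Is an advertisement an offer?",
    {"yes": remedial, "no": advanced},
)
path = run_lesson(root, ["yes"])
```

[The “pruning” John describes would amount to keeping only the branches for answers the author expects, rather than every possible response.]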

He describes another project: Classcaster, a blog network using open source software. It’s built on top of PBX software (!). “With classcaster, you can make a phone call, you can leave an hour message. Then it instantly podcasts it.” But it was expensive paying for the phone call and the recording quality is crappy. Instead, they gave authors $1000 and a free digital recorder. There are now 60 faculty members doing podcasts that way. They’re available for free as part of the commons. As a result, “students started to tell us that they have this crappy evidence teacher so instead they listen to this other evidence teacher’s podcast.” And faculty noticed in listening to themselves that they’re skipping over some things, so it’s helped them improve. Other faculty learned teaching techniques by listening to others. On the other hand, in some courses (e.g., family law) it can suppress class participation.

Lessons are tagged according to a “topic grid,” based on how faculty describe their lessons, the “elevator pitch” of what a course is. CALI took a first cut at the taxonomy by looking at syllabi and then letting faculty refine it. They’re now going back and tagging the podcasts.

Another project is Access to Justice. CALI designed an interface that asks one question at a time (audibly asks) to help people find the right legal forms. It uses avatars because otherwise you get hung up on providing avatars of every race and gender, in a wheelchair or not, etc. Instead, it provides a non-racial — “blank” — male or female avatar. [Looks pretty white to me.] It shows the avatar on a path to a hall of justice. There are people in eight states working on the navigators for all the forms, but they reuse one another’s work because the forms are generally 90% the same in the states. One of the federal courts is interested in doing it and sharing it with the rest of the fed courts. (It’s all XML data and is written in Flash.)

ScholarshipPulse is in alpha. On the left it shows a paper. On the right is a comment system. It distinguishes comments as peers, professors or students. They’re experimenting with having the font size reflect one’s standing in the system. “I know we’re playing in ego space here.” But, John says, why not let people comment on their own blogs? Press a button and it’ll take a capture of the paper and your comment, and post it straight into your blog.

eLangdell.org (named after Langdell Hall at Harvard Law, or maybe after whoever Langdell Hall was named after, which I’m guessing is someone named Langdell) pools syllabi, cases, podcasts, etc. so you can dynamically create case books and other course materials. You can print out your own materials via lulu.com. AALS, CLEA and Counseling Central do something similar, he says.

Q: Are you doing anything to help people who are not in law school?
A: CALI’s LearnTheLaw.org lets you pay for access to the CALI lessons. [It’s $50/yr.]

Q: What’s your business model?
A: 200 schools pay us $5K a year. For that they get everything we produce, but I’m trying to give away as much as possible. Not the lessons. If we gave them away, the law schools would stop paying us. Everything else, just about, is open and free.

Q: (Charlie Nesson) MIT’s open courseware opens up syllabi. They’ve just started videoing classes, 21 of them. They’ve raised the question for us about whether there’s an opportunity for Harvard Law to step into the video YouTube space, recognizing the Law School’s mission as offering a legal education (not necessarily for credit) to the world. You’ve been at this for a long time. Somehow there’s a relationship between the profit and non-profit. Suppose a company came to you…
A: We don’t need profit but we do need sustainability. The case book market is about $90M. Suppose you came in with uber casebooks that you could mix and match. We’d pay faculty to write those. That would put pressure on faculty to use the free PDF (or $18 lulu version) case book. A $90M market would become a $20M one. That’s what eLangdell is.

There are hard problems doing this, he says. One is metadata. “People just drop stuff in.” They’re going to have to make the contributors do it. “Maybe we can hire students,” but for now they have to make it easy. In addition to the taxonomy, they’ll allow tags. Charlie points out that tagging might be the fastest way to get it done and usable. I mention freebase as a model for mixing a starter-set taxonomy, a mechanical Turk approach, and a wiki for metadata schema. John says that with a critical mass, it’ll get done.

Q: (Gene Koo) Charlie, you have a paper-based text book. Would you switch?
A: (Charlie) I’d love to. Unfortunately, my publisher owns the copyright.

A: It’s a Clayton Christensen innovator’s dilemma. We’ll pick off the low-hanging fruit. And, maybe retiring professors will donate their teaching materials into the commons as part of their “legacy.” [Tags: cali berkman everything_is_miscellaneous law education teaching commons creative_commons]

Categories: Uncategorized Tagged with: digital culture • education • everythingIsMiscellaneous • taxonomy Date: March 13th, 2007 dw

2 Comments »
