
February 26, 2011

Has HarperCollins lost its mind or its soul?

HarperCollins has changed its agreement with the main distributor of e-books to libraries: e-books will now become inaccessible after 26 checkouts.

I understand publishers’ desire to limit ebook access so that selling one copy doesn’t serve the needs of the entire world. But think about what this particular DRM bomb does to libraries, one of the longest-lived institutions of civilization. Libraries exist not just to lend books but to guarantee their continuous availability throughout changes in culture and fashion. This new licensing scheme prevents libraries from accomplishing this essential mission.

It’s beyond ironic. Until now, libraries have in fact had to scale back on that mission because there isn’t enough space for all the physical books they’ve acquired over the years. So, they get rid of books that have fallen out of fashion or no longer seem important enough. Now that the digital revolution has so lowered the cost of storage that libraries can at last do far better at this culture-building mission, a major publisher has instituted the nightmare culture-killing license.

So, why do I say that HarperCollins has lost its soul instead of just criticizing it for this action? Because I don’t see how this scheme could make sense to a publisher unless the publisher had given up on books as a primary way we build a culture together. If you cared about books as vehicles of ideas and not just vehicles of commerce, you would have dismissed with contempt an idea that treats them as being as evanescent as chatter on a call-in show.


Categories: copyright, culture, libraries, too big to know Tagged with: copyright • drm • libraries Date: February 26th, 2011 dw

7 Comments »

February 22, 2011

[2b2k] Has the Internet killed our theory of media’s effect on ideas and culture?

Here’s a paragraph from the draft of the book I’ve been working on. It’s a draft, so contents are subject to settling during shipping.

…as revolution spread from Tunisia to Egypt at the start of 2011, a controversy arose about how much credit social media such as Facebook and Twitter ought to get. Malcolm Gladwell, the author of The Tipping Point, had written a New Yorker article in October 2010 arguing that social media are over-rated as tools of social change because they only enable “weak ties” among people, instead of the “strong ties” activists need in order to put themselves at risk. When some media and bloggers credited social media in the Mideast revolutions of 2011, Gladwell posted a two-hundred-word essay asserting that the influence of social media was “the least interesting fact.” Gladwell’s comments were a corrective to those who carelessly referred to the events as “Facebook revolutions” or “Twitter revolutions,” as if social media were the sole cause, but he also disputed those who thought social media played any significant role at all. Given Gladwell’s standing, and the fact that The Tipping Point is about the importance of social networks, his position surprised many. But my point is not that Gladwell is mistaken, although I think he is. It’s that even if we do accept that social media played a role of some significance, it’s not at all clear what role they played. The more one looks at the question, the clearer it becomes that we don’t even have an agreed-upon explanatory framework within which the question might be resolved. And this is true not only of questions touching the Internet. For example, a couple of months after the New Yorker ran the original Gladwell piece, it published an article by Louis Menand that wondered how to gauge the social and political effects of books such as Betty Friedan’s 1963 The Feminine Mystique. We look at social media at work in civil unrest and we wonder: How much do media shape us? How does it happen? Does media influence have the same effects on all cultures? On all strata of society? How much of social unrest in general and in particular countries comes about as the result of having access to information? How much is the result of communication? Of sociality? If there were no social media, would the revolutions have happened, and, if so, how might they have been different?

As the Menand piece makes clear in its discussion of the effect of The Feminine Mystique, Silent Spring, and Unsafe at Any Speed, we used to think we knew at least part of how media influence ideas and policy. You write an important book, you go on Dick Cavett and Firing Line, and it changes minds and brings about changes. How? Well, um, it altered “the way we think about things” or some such phrase. We had a lot invested in the power of books.

Now, that theory seems not just hopelessly over-simplified, but wrong. I don’t know if that’s because single cultural items no longer have the impact that they once did, or if they never did but now we can see how influence actually spreads by following links and through up-and-coming tools such as the Berkman Center’s MediaCloud. Or both. Or neither.


Categories: culture, libraries, too big to know Tagged with: 2b2k • books • communications • media Date: February 22nd, 2011 dw

2 Comments »

February 11, 2011

Library Lab funds library innovation projects

Harvard’s new Library Lab has announced the first projects it will be funding. It’s an exciting set of projects.

The Library Innovation Lab [blog] that Kim Dulin and I co-direct had a few of its proposals accepted:

  • Library Analytics Toolkit: Tools to enable libraries to understand, analyze, and visualize the patterns of activities, including checkouts, returns, and recent acquisitions, and to do so across multiple libraries.

  • LibraryCloud Server: Build and maintain a web server that makes available to all Harvard library innovators data and metadata gathered from the Harvard libraries.

  • Library Innovation Podcasts: A series of biweekly podcast interviews with library innovators about their projects and ideas. The initial series would consist of 15 podcasts of about 20 minutes each.

We’re very excited about these, and have already begun work on them. I have a particular fondness for the LibraryCloud server, and will have more to say about it over time, for we view it ultimately as a multi-library system.

And a word of clarification. The Library Lab is part of Harvard’s Office for Scholarly Communication. It was established last year to create an infrastructure for library innovation at the school. The Harvard Library Innovation Lab at the Harvard Law School Library (which is why we tend to call it LIL) is a small lab that creates applications and prototypes that try to show how libraries can bring more of their value online. If it’s a cool idea for libraries, we’ll try to build it.


Categories: libraries Tagged with: libraries Date: February 11th, 2011 dw

12 Comments »

January 30, 2011

The more things (books) change…

Not many years ago ex-Governor Alfred E. Smith complained that his autobiography, Up to Now, was not being promoted vigorously enough. “But, Governor,” remonstrated his publishers, “we planted your book in every bookstore in the country.” “Bookstores,” snorted the Governor, unconsciously summing up every publisher’s grievance for the past five generations. “Who in hell goes to bookstores?”

From Try and Stop Me, by Bennett Cerf, 1944. (Simon & Schuster, p. 107)


Categories: libraries Tagged with: books • libraries Date: January 30th, 2011 dw

6 Comments »

January 29, 2011

The more things (like books) change…

Not many years ago ex-Governor Alfred E. Smith complained that his autobiography, Up to Now, was not being promoted vigorously enough. “But, Governor,” remonstrated his publishers, “we planted your book in every bookstore in the country.” “Bookstores,” snorted the Governor, unconsciously summing up every publisher’s grievance for the past five generations. “Who in hell goes to bookstores?”

From Try and Stop Me, by Bennett Cerf, 1944. (Simon & Schuster, p. 107)


Categories: humor, libraries Tagged with: books • humor • libraries Date: January 29th, 2011 dw

1 Comment »

January 26, 2011

McLuhan in his own voice

As a gift on the centenary of Marshall McLuhan’s birth, a site has gone up with videos of him explaining his famous sayings. Some of them still have me scratching my head, but other clips are just, well, startling. For example, this description of the future of books is from 1966.


Categories: libraries, media Tagged with: books • library • mcluhan Date: January 26th, 2011 dw

2 Comments »

January 20, 2011

If you laid out all the shelves in Harvard’s libraries…

Mainly because I wanted to futz around with the Google Maps API, I’ve created a mashup that pretends to lay out all the shelves in Harvard’s 73 libraries on a map.

Screen capture of map
Click to go to the page

You can choose your starting point — it defaults to Widener Library at Harvard — and choose whether you’d like to see a line of markers or concentric circles. It then pretends to map the shelves according to how many books there are in each subject.

Here’s where the pretending comes in. First, I have assumed that each book in the 12,000,000 volume collection is one inch thick. Second, I have used the Dewey Decimal system’s ten subject areas, even though Harvard doesn’t use Dewey. Third, I used an almost entirely arbitrary method to figure out how many books are in each subject: I did keyword subject searches. Sometimes, when the totals seemed way too low, I added in searches on sub-headings in Dewey. At the end, the total was probably closer to 4 million, which means my methodology was 300% unreliable. (Note: Math was never my strong suit.)

So, the actual data is bogus. For me, learning how to use the API was the real point. If you happen to have actual data for your local library, you can download the page and just plug them into the array at the beginning of the page. (All the code is in the .html file.)
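The arithmetic behind the layout is easy to sketch outside the Maps API. Here is a hedged Python sketch — the volume counts, the starting latitude, and the `marker_offsets` helper are all hypothetical (the real page keeps its data in a JavaScript array and draws with the Google Maps API) — showing how cumulative shelf length, at the post’s one-inch-per-volume assumption, turns into marker positions north of a starting point:

```python
# Hypothetical volume counts per subject; the real page fills its
# array from keyword subject searches against the catalog.
volumes = {
    "General": 400_000,
    "Social sciences": 900_000,
    "Science": 700_000,
    "Literature": 1_000_000,
}

INCHES_PER_VOLUME = 1          # the post's simplifying assumption
METERS_PER_INCH = 0.0254
METERS_PER_DEG_LAT = 111_320   # rough length of one degree of latitude

def marker_offsets(start_lat, counts):
    """Place one marker per subject at the cumulative shelf
    distance (converted to degrees of latitude) from the start."""
    offsets, running_m = [], 0.0
    for subject, n in counts.items():
        running_m += n * INCHES_PER_VOLUME * METERS_PER_INCH
        offsets.append((subject, start_lat + running_m / METERS_PER_DEG_LAT))
    return offsets
```

With these made-up counts (3 million volumes in all), the last marker lands about 0.68 degrees — roughly 76 km of shelving — north of the starting point, which gives a feel for why even a “bogus” version of the map is fun to look at.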


Categories: libraries, tech Tagged with: google maps • libraries • maps • programming Date: January 20th, 2011 dw

4 Comments »

January 15, 2011

Patrons empty library to prove its value

BoingBoing.net reports:

The library in Stony Stratford near Milton Keynes, England, urged its patrons to check out every book on the shelves as a way of proving to the local council that its collection and facilities provide a vital service to the community. Stony Stratford is one of many towns across the UK that are facing severe library closures as the Tory-LibDem coalition government recklessly slashes its transfer payments to local governments (while breaking their promise to rein in enormous bonuses at the banks, even the ones that are owned by the taxpayer).

Let’s just hope the local government doesn’t look around the emptied library and think, “Yeah, great, I can really see how the new town road repair tool shed could fit in that corner labeled ‘Classics,’ and we could put the new town golf course’s pro shop over there by where the empty ‘Science’ shelves are…”

(Cross-posted at the Harvard Library Innovation Lab blog.)


Categories: libraries Tagged with: libraries Date: January 15th, 2011 dw

1 Comment »

December 31, 2010

Happy new year, libraries!

May 2011 be the best year for libraries in a couple of millennia!

So much is going on that it could be, you know. (And how often do you get to say that?)


Categories: libraries, too big to know Tagged with: libraries Date: December 31st, 2010 dw

Be the first to comment »

November 30, 2010

[bigdata] Ensuring Future Access to History

Brewster Kahle, Victoria Stodden, and Richard Cox are on a panel, chaired by the National Archives’ Director of Litigation, Jason Baron. The conference is being put on by Princeton’s Center for Information Technology Policy.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Brewster goes first. He’s going to talk about “public policy in the age of digital reproduction.” “We are in a jam,” he says, because of how we have viewed our world as our tech has changed. Brewster founded the Internet Archive, a non-profit library. The aim is to make freely accessible everything ever published, from the Sumerian texts on. “Everyone everywhere ought to have access to it” — that’s a challenge worthy of our generation, he says.

He says the time is ripe for this. The Internet is becoming ubiquitous. If there aren’t laptops, there are Internet cafes. And there are mobiles. Plus, storage is getting cheaper and smaller. You can record “100 channel years” of HD TV in a petabyte for about $200,000, and store it in a small cabinet. For about $1,200, you could store all of the text in the Library of Congress. Google’s copy of the WWW is about a petabyte. The WayBack Machine uses 3 petabytes, and has about 150 billion pages. It’s used by 1.5 million people a day. A small organization, like the Internet Archive, can take this task on.
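Brewster’s figures are easy to sanity-check. A hedged back-of-envelope in Python — assuming a decimal petabyte and taking his “100 channel-years per petabyte” at face value — shows the claim implies an average stream of roughly 2.5 Mbps, modest for HD but plausible for broadcast-grade encoding of the era:

```python
# Back-of-envelope check of "100 channel-years of TV per petabyte".
PETABYTE_BITS = 10**15 * 8                # decimal petabyte, in bits
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Average bitrate implied by fitting 100 channel-years in one petabyte.
implied_mbps = PETABYTE_BITS / (100 * SECONDS_PER_YEAR) / 1e6  # ~2.5 Mbps

# At the quoted ~$200,000 per petabyte, storage cost per channel-year.
cost_per_channel_year = 200_000 / 100     # ~$2,000
```

The point survives the arithmetic: archiving a television channel’s entire yearly output costs on the order of a couple of thousand dollars of disk, which is why a small non-profit can credibly take the task on.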

This archive is dynamic, he says. The average Web page has 15 links. The average Web page changes every 100 days.

There are downsides to the archive. E.g., the WayBack Machine gets used to enable lawsuits. We don’t want people to pull out of the public sphere. “Get archived, go to jail,” is not a useful headline. Brewster says that they once got an FBI letter asking for info, which they successfully fought (via the EFF). The Archive gets lots of lawyer letters. They get about 50 requests per week to have material taken out of the Archive. Rarely do people ask for other people’s stuff to be taken down. Once, the Scientologists wanted some copyright-infringing material taken down from someone else’s archived site; the Archive finally agreed to this. The Archive held a conference and came up with the Oakland Archive Policy for issues such as these.

Brewster points out that Jon Postel’s taxonomy is sticking: .com, .org, .gov, .edu, .mil … Perhaps we need separate policies for each of these, he says. And how do we take policy ideas and make them effective? E.g., if you put up a robots.txt exclusion, you will nevertheless get spidered by lots of people.
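For reference, the robots.txt exclusion Brewster mentions is just a plain-text file at a site’s root. A minimal example asking all crawlers to stay away — and, as he notes, compliance is entirely voluntary, which is exactly the policy problem:

```text
# robots.txt — placed at https://example.com/robots.txt
# Well-behaved spiders honor these rules; nothing enforces them.
User-agent: *
Disallow: /
```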

“We can build the Library of Alexandria,” he concludes, “but it might be problematic.”

Q: I’ve heard people say they don’t need to archive their sites because you will.
A: Please archive your own. More copies make us safe.

Q: What do you think about the Right to Oblivion movement that says that some types of content we want to self-destruct on some schedule, e.g. Facebook.
A: I have no idea. It’s really tough. Personal info is so damn useful. I wish we could keep our computers from being used against us in court; if we defined the 5th amendment so that who “we” are included our computers…


Richard Cox says if you golf, you know about info overload. It used to be that you had one choice of golf ball, Top-Flite. Now they have twenty varieties.

Archives are full of stories waiting to be told, he says. “When I think about Big Data…most archivists would think we’re talking about big science, the corporate world, and government.” Most archivists work in small cultural, public institutions. Richard is going to talk about the shifting role of archivists.

As early as the 1940s, archivists were talking about machine-readable records. The debates and experiments have been going on for many decades. One early approach was to declare that electronic records were not archives, because the archives couldn’t deal with them. (Archivists and records managers have always been at odds, he says, because RM is about retention schedules, i.e., deleting records.) Over time, archivists came up to speed. By 2000, some were dealing with electronic records. In 2010, many do, but many do not. There is a continuing debate. Archivists have spent too long debating among themselves when they need to be talking with others. But, “archivists tend not to be outgoing folks.” (Archivists have had issues with the National Archives because their methods don’t “scale down.”)

There are many projects these days. E.g., we now have citizen archivists who maintain their own archives and who may contribute to public archives. Who are today’s archivists? Archival educators are redefining the role. Richard believes archives will continue, but the profession may not. He recommends reading the Clair report [I couldn’t get the name or the spelling, and can’t find it on Google :( ] on audio-visual archives. “I read it and I wept.” It says that we need people who understand the analog systems so that they can be preserved, but there’s no funding.


Victoria Stodden’s talk has the gloomy title “The Coming Dark Ages in Scientific Knowledge.”

She begins by pointing to the pervasive use of computers and computational methods in the sciences, and even in the humanities and law schools. E.g., Northwestern is looking at the word counts in Shakespearean works. It’s changing the type of scientific analysis we’re doing. We can do very complicated simulations that give us a new way of understanding our world. E.g., we do simulations of math proofs, quite different from the traditional deductive processes.

This means what we’re doing as scientists is being stored in scripts, code, data, etc. But science only is science when it’s communicated. If the data and scripts are not shared, the results are not reproducible. We need to act as scientists to make sure that this data etc. are shared. How do we communicate results based on enormous data sets? We have to give access to those data sets. And what happens when those data sets change (corrected or updated)? What happens to results based on the earlier sets? We need to preserve the prior versions of the data. How do we version it? How do we share it? E.g., there’s an experiment at NSF: all proposals have to include a data management plan. The funders and journals have a strong role to play here.

Sharing scientific knowledge is harder than it sounds, but is vital. E.g., a recent study showed that a cancer therapy will be particular effective based on individual genomes. But, it was extremely hard to trace back the data and code used to get this answer. Victoria notes that peer reviewers do not check the data and algorithms.

Why a dark age? Because “without reproducibility, knowledge cannot be recreated or understood.” We need ways and processes of sharing. Without them, we only have scientists making proclamations.

She gives some recommendations: (1) Assessment of the expense of data/code archiving. (2) Enforcement of funding agency guidelines. (3) Publication requirements. (4) Standards for scientific tools. (5) Versioning as a scientific principal. (6) Licensing to realign scientific intellectual property with longstanding scientific norms (Reproducible Research Standard). [verbatim from her slide] Victoria stresses the need to get past the hurdles copyright puts in the way.

Q: Are you a pessimist?
A: I’m an optimist. The scientific community is aware of these issues and is addressing them.

Q: Do we need an IRS for the peer review process?
A: Even just the possibility that someone could look at your code and data is enough to make scientists very aware of what they’re doing. I don’t advocate code checking as part of peer review because it takes too long. Instead, throw your paper out into the public while it’s still being reviewed and let other scientists have at it.

Q: [rick] Every age has lost more info than it has preserved. This is not a new problem. Every archivist from the beginning of time has had to cope with this.


Jason Baron of the National Archives (who is not speaking officially) points to the volume of data the National Archives (NARA) has to deal with. E.g., in 2001, 32 million emails were transferred to NARA; in 2009, more than 250 million were. He predicts there will be a billion presidential emails held at NARA by 2017. The first lawsuit over email was filed in 1989 (email=PROFS). Right now, the official policy of 300 government agencies is to print email out for archiving. We can no longer deal with the info flow with manual processes. Processing of printed pages occurs when there’s a lawsuit or a FOIA request. Jason is pushing the value of search as a way of encouraging systematic intake of digital records. He dreams of search algorithms that retrieve all relevant materials. There are clustering algorithms emerging within law that hold hope. He also wants to retrieve docs other than via keywords. Visual analytics can help.

There are three languages we need: Legal, Records Management, and IT. How do we make the old ways work in the new? We need both new filtering techniques, but also traditional notions of appraisal. “The neutral archivist may serve as an unbiased resource for the filtering of information in an increasingly partisan (untrustworthy) world” [from the slide].


Categories: libraries, science, too big to know Tagged with: 2b2k • archives • bigdata • libraries • science Date: November 30th, 2010 dw

3 Comments »



Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
TL;DR: Share this post freely, but attribute it to me (name (David Weinberger) and link to it), and don't use it commercially without my permission.

Joho the Blog uses WordPress blogging software.
Thank you, WordPress!