Sunday, January 02, 2011

An Internet Archive Cleanup Day on 1/1/2012?

I love exploring the Internet Archive and building tools to help find media out there in the commons. As with any great collection or archive, there is a need for various kinds of cleanup tasks. Here are a few things that I'd like to help clean up:

* Missing files. Bad uploads. Missing transcodes.
* Missing (core) metadata.  I've even seen missing publication dates.
* Constructive reviews & ratings. This may also be a good way to add metadata such as tags, shootlists, etc.
* Inappropriate contributions. There is a lot of spam and other items that should be darkened.
* Transcodings. Many transcoded videos are missing audio.
* Duplicates. I have a few I should remove.
* SEO. Promote what you love.
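
As a rough illustration of the missing-metadata item above, here is a minimal Python sketch that flags records lacking core fields. It works on plain dictionaries rather than calling any Archive service. The function names, the choice of which fields count as "core", and the sample records are all my own assumptions for illustration.

```python
# Hypothetical sketch: which fields count as "core" is an assumption,
# not an official Internet Archive definition.
CORE_FIELDS = ("title", "creator", "date", "description")


def missing_core_fields(record, core_fields=CORE_FIELDS):
    """Return the core fields that are absent or blank in a metadata record."""
    missing = []
    for field in core_fields:
        value = record.get(field)
        # Treat None, empty strings, and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def items_needing_cleanup(records):
    """Map each item's identifier to its list of missing core fields."""
    report = {}
    for record in records:
        missing = missing_core_fields(record)
        if missing:
            report[record.get("identifier", "<unknown>")] = missing
    return report


# Made-up sample records, for illustration only.
sample = [
    {"identifier": "example-item-1", "title": "A Film", "creator": "Someone",
     "date": "1936", "description": "A short film."},
    {"identifier": "example-item-2", "title": "Untitled", "creator": "",
     "description": "No date or creator."},
]

print(items_needing_cleanup(sample))
```

A report like this could become a simple to-do list for volunteers on a cleanup day, with one entry per item that needs attention.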

For some time, I've been thinking about trying to organize and promote an Internet Archive Cleanup Day.  Yesterday, Lucas Gonze pointed out that it was Public Domain Day...

I did not know that! It occurred to me that this would have been a great day for an IACD, and I shoved the idea onto the back burner.

Today, Lucas asked for ideas about what to do to celebrate the next PDD...

So let's put this idea back on the front burner and start thinking about what kind of Public Domain Cleanup Day activities might make sense.

Of course, an obvious activity is contributing to the commons. See Lucas' post "license on my own music" for a discussion about how he licenses the PD music he covers.  Lucas organized a fun and thought-provoking session at BarCampLA 2006 on best practices for covering PD works.  This could be the basis of PDD contribution activities.

I'll post some new search tools to help find media and collections of interest that can benefit from some cleanup help.

What other kinds of things might be good to try during a yearly PD cleanup effort?  How can we make it fun, or maybe even a game?


Blogger Krystian said...

Hi Markus,

Nice blog, I love the blogger layout and profile pages etc... seems more homely in a good way.

Think the clean-up's a good idea. The number of bad transcodes I've seen on has wilted my visits, which is a shame as I love the idea behind the site.

5:54 PM  
Blogger Markus Sandy said...

Hi Krystian. I think a better experience is very possible with a little cleanup. Archive is using html5 now and seems to be favoring mpeg4 over flv these days. If we can take a little time to identify the videos that need recoding, it's actually quite easy to rerun them with better parameters.

7:56 PM  
Blogger Internet Archive said...

Hi Markus,

Jeff here at Internet Archive. We'd love to coordinate something with you. Can you shoot me an email at


Jeff Kaplan
Collections Manager

1:33 PM  
Blogger Markus Sandy said...

Hi Jeff,

Thanks for commenting on my blog post.

My hope was to start some open discussion, solicit ideas and then present a project plan to someone at your organization. So here you are already! :)

A little background:

Many years ago, I worked on the Ourmedia project and have some familiarity and passion for working with an Archive collection. Lately, I've been extending some of Tracey Jaquith's examples into some fun html5 search tools that I plan to blog about soon. As with any good search tool, one finds all kinds of great content, but also uncovers areas where a little TLC would be of great benefit. This got me to thinking about ways to get folks to participate in some kind of community 'spring cleaning' type event. I saw Lucas Gonze's tweet about Public Domain Day, which motivated my post and here we are, thanks to your comment.

I have specific ideas I'd be happy to share. I was thinking of doing that via blog so as to get feedback, but I'm open to any approach. I'm sure you folks have thought a lot more about content curation than I have and so I'd like to hear what you think would be helpful.

I like the idea of this being part of Public Domain Day, but am not wedded to that. I had envisioned a coordinated effort with Creative Commons regarding content licensing.

Bottom line: I'd love to help organize and promote an annual day that helps both raise awareness and maintain the quality of the Internet Archive.

I look forward to hearing your thoughts.

Markus Sandy
skype: msandy

6:26 PM  
