Apache Tika 0.6 - fresh meat!

January 31st, 2010

Check out Apache Tika 0.6. It's hot off the presses and contains even more goodies for MIME type detection and content analysis than before. And it's got 30% more Apache goodness than the leading competitors. Feel free to head over here and pick it up!

NASA and Apache: a match made in heaven!

January 22nd, 2010

It's official. NASA has its first sorta official Apache project!

At this stage, Apache's official policy is to not do official PR announcing the project (we are kind of at step 2 of 3 in the process). So what could be less official than my crappy blog that no one reads! I should be able to lord it on here for ages and still be within the policy! ;)

Anyhoo, we have a ways to go before graduation, having just had a formal vote and been accepted into the Incubator, but it couldn't hurt for me to dream, right?

Spatial Solr

December 31st, 2009

So, lately, I've been pretty involved in Spatial Solr. I'm just scratching the surface of trying to understand this stuff. Mat Brown recently posted an interesting plugin, though, that looks really clean and easy to understand. I'm going to give it a go and see how it does on my oceans dataset.

Apache Tika 0.5 out the door

December 31st, 2009

I announced a while back that Tika 0.5 is available for download. Get it while it's hot. Notable changes include moving to a source-only release this time, improved RDF and OWL parsing and detection, and other speedups.

Apache Tika 0.4 released - get it while it's hot

August 16th, 2009

I had the privilege of being the release manager for yet another Apache Tika release. 0.4 is out the door and has a number of major improvements over prior releases, including a major refactoring that separates Tika's core components from its parsers, and both from its external interfaces.

You can read the release announcement here.

You can grab Tika here.

Apache Tika 0.3 out the door

March 20th, 2009

Apache Tika, a sub-project of Apache Lucene and a toolkit for content analysis and detection, has just made its 0.3 release.

You can grab the release from a nearby mirror here.

2009 Rose Bowl Victory: USC

January 1st, 2009

I must admit, I was originally a bit disappointed that USC wouldn't be playing in the national championship game. I mean, if not for a shitty first half, the worst possible first half they could have had, at Oregon State in the 3rd game of the season, USC is playing Oklahoma for the national title. And we all remember what happened the last time that game was played.

However, I should say, I'm really, I mean really, enjoying winning the Rose Bowl every year. USC has become razor sharp at it. In many ways, it's become our house, and not the one belonging to that other team in Westwood.

CONGRATS, Trojans!

Apache Tika 0.2 released!

December 10th, 2008

Apache Tika 0.2 has recently been released! Thanks to Dave Meikle for leading the charge.

You can grab Tika 0.2 here. Of note is that Tika recently graduated out of the Incubator and is now a full-fledged sub-project of Apache Lucene.

w00t!

Latest "I just started a blog" update

December 4th, 2008

...comes from Dave Woollard, who wins the award for interesting blog name, Macgyver Was Here.

Welcome to the blogosphere, Mr. Woollard!

Welcome back, blogger

November 26th, 2008

After what seemed like years (yet what was probably months), the blog, and the Pagemewhen environment, are back online.

Interestingly enough, after total disaster with the old system (baron died, and I lost ~60 GB of data, most of which I will never be able to get back), I was able to reconstruct about 95% of my original blog by visiting the Wayback Machine at the Internet Archive. Thanks, guys, for the memories.

And welcome back, Potter.