Eclipse Ecosystem

A blog devoted to promoting the Eclipse ecosystem

Friday, May 12, 2006

Random thoughts on searching for Eclipse

This blog has been in my draft list for a long time, and the release of Google Trends made me come back and take another run at it. I had a feeling that there would be a flood of blogs with all kinds of neato data this week related to Google Trends, and sure enough there has been a boatload. I've found the data to be a bit hit and miss though, especially when you start zooming in by year and comparing the data to aggregated lists and other sources. Moreover, the lack of absolute terms is a bit disconcerting too - did it grow from 10 to 100 or 10,000 to 100,000? I do appreciate the challenge in this space, as I have been poking around with some "meme miner" tools over the past few months.
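To make the "10 to 100 or 10,000 to 100,000" worry concrete, here is a minimal sketch. It assumes a trend tool that rescales each series so its peak equals 100; that is my assumption for illustration, not a description of how Google Trends actually computes its numbers. The function name and both series are made up.

```python
def to_trend_index(counts):
    """Rescale raw counts so the largest value becomes 100.

    Assumed normalization, for illustration only.
    """
    peak = max(counts)
    return [round(100 * c / peak) for c in counts]


# Two hypothetical search-volume series that differ by a factor of 1,000.
small = [10, 25, 50, 100]
large = [10_000, 25_000, 50_000, 100_000]

# Once normalized, the two series are indistinguishable.
print(to_trend_index(small))
print(to_trend_index(large))
```

Both series come out as the same index, so the chart alone cannot tell you which one you are looking at.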

But I digress, as does most of this blog.

If I ever launch a new product or company I'm going to call it "Bezawankafolzamullumba" or something equally nonsensical. There are more and more "trend tools" like Google Trends, and darn it, "eclipse" is used not only to find the technology, but also for the solar and lunar kind.

We know from our web server stats that 35% of all search engine hits we get are from people who simply search for "eclipse". Validating this is the fact that the Eclipse Foundation is ranked number 1 for "eclipse" on Google and Ask.com, second on MSN.com and sixth on Yahoo. There isn't even a close second search term hitting our site. The next most popular search click-through to our site is usually "eclipse download", running around 2.5%. So the data is clear - by more than an order of magnitude, and well outside any standard error, when a developer wants to find Eclipse (the technology), they simply search for "eclipse", and are usually rewarded with a #1 hit. Unfortunately, when you look at a trend tool for "eclipse", the data is perfectly washed out by the solar and lunar calendar.
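For anyone who wants to run the same comparison on their own referrer logs, here is a minimal sketch of the arithmetic. The hit counts are hypothetical, chosen only so the shares match the 35% and 2.5% figures above; the function name and the "(all other terms)" bucket are mine, not anything from our actual stats tooling.

```python
from collections import Counter


def term_shares(hits):
    """Return each search term's fraction of all search-engine referrals."""
    total = sum(hits.values())
    return {term: count / total for term, count in hits.items()}


# Hypothetical counts out of 10,000 referrals, matching the percentages quoted above.
hits = Counter({
    "eclipse": 3500,
    "eclipse download": 250,
    "(all other terms)": 6250,
})

shares = term_shares(hits)
ratio = shares["eclipse"] / shares["eclipse download"]
print(f"{shares['eclipse']:.1%} vs {shares['eclipse download']:.1%}, about {ratio:.0f}x")
```

The gap between the top term and the runner-up works out to roughly 14 to 1.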

With meme miners and other tools, I've tried working with various modifiers like "eclipse ide", "eclipse java" or "eclipse download", but there are flaws with that approach as well. Not only is there never a close second to "eclipse" in web search referrals, there is almost always a fairly random grouping of search terms in the 8th through 50th rankings. For example, "eclipse php" is sometimes quite high, and sometimes drops a bit lower.

The reasoning is pretty simple (and Google Trends validates this quite nicely) -- a lot of searching is done in response to current events. If there is a story in popular media on a hot topic, then the searches with modifiers like "php", "rcp" or "swt" spike. Moreover, there are often strong correlations with searches for related products. When JDeveloper announced it would become free in June last year, their search trends spiked and so did Eclipse (it's hard to see in that link, but there is a clear coincident spike at the same time). When the marketing hype leading up to NetBeans 5.0 was happening in late 2005 and early 2006, there was a clear and coincident spike in searches. Bottom line - just as industry news pulls related stocks up and down as a group, there are clear indications that search volume is similarly linked.

Another impact is the organization of your website. It's not at all uncommon to visit a website, and then revert to a search engine to look for something within the site (even when the site offers a search ability). For example, if we removed all links to the download page from eclipse.org, I'd bet anything that we would see a major spike in "eclipse download" search hits.

Imagine the use case - you want to download Eclipse. So, you google "eclipse", click to eclipse.org, don't immediately see a download link, you go back in the browser, search "eclipse download" and voila, two search hits for the price of one. Looking at the trends (not only from google, but from our own stats) it's clear - when our website is updated and refined to make it easier for people to find what they want, search traffic drops and there are fewer page hits.

So although I find search trends to be a very useful tool for understanding an ecosystem, one has to be very careful when interpreting the data. It's why I really like to rely on sources like job postings, raw download stats and other "harder to bias" stats.

Any other factors impacting search trends? One quick thought is the obvious calendar influence. Much less searching over Christian holidays it seems.

- Don
