Search

Google Suggest does type-ahead for Google Index in near real-time

Go try it now, if you haven't. Google Suggest shows the feasibility of using type-ahead with very large collections of terms, like tags in a folksonomy.

Now, one of the drawbacks of using ad hoc tags is the lack of vocabulary control - people use different tags to mean the same thing. This is fine for organizing personal information architectures, but the lack of consistency, while reducing the cognitive cost of classification, actually increases the effort of finding information.

To deal with the issue, there needs to be a feedback loop. has the most popular tags float to the top, and others use type size to show more popular tags. There's an argument for that kind of subtle feedback. However, to really bridge between levels of classification, to move from a distributed folksonomy to a controlled vocabulary and then to a formal thesaurus, we need more than implicit incentive in using a particular tag. Using type ahead to show other tags is one way of doing that, as James Spahr . But I've always wondered about how scalable this approach would be with a massive tagset. With Google Suggest, instead of wondering how type ahead would scale, I'm wondering how we can implement a similar scale system for tags...
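A minimal sketch of what type-ahead over a tag set could look like, assuming the tags and their usage counts fit in memory. It is not how Google Suggest works; the class name TagSuggester and the sample counts are made up for illustration. A sorted list plus binary search finds the block of tags sharing a prefix, and the more popular tags are offered first, which nudges people toward the tags others already use.

```python
import bisect


class TagSuggester:
    """Prefix lookup over a tag vocabulary, most popular tags first."""

    def __init__(self, tag_counts):
        # Normalize case and merge counts for tags that differ only by case.
        self._counts = {}
        for tag, count in tag_counts.items():
            key = tag.lower()
            self._counts[key] = self._counts.get(key, 0) + count
        # A sorted list keeps all tags with a common prefix next to each other.
        self._sorted = sorted(self._counts)

    def suggest(self, prefix, limit=5):
        prefix = prefix.lower()
        # Binary search for the first tag that is >= the prefix.
        i = bisect.bisect_left(self._sorted, prefix)
        matches = []
        while i < len(self._sorted) and self._sorted[i].startswith(prefix):
            matches.append(self._sorted[i])
            i += 1
        # Rank by popularity (descending), then alphabetically for ties.
        matches.sort(key=lambda t: (-self._counts[t], t))
        return matches[:limit]


suggester = TagSuggester(
    {"search": 40, "searching": 12, "seo": 25, "semantic": 7, "taxonomy": 30}
)
print(suggester.suggest("se"))
```

For a truly massive tagset the lookup would likely live on the server behind a small request per keystroke, but the ranking idea stays the same.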

More IA Heuristics - Search

Lou Rosenfeld , this time focused on search.

Enhance Usability by Highlighting Search Terms

A List Apart offers a in the page that were searched for by the user. You can to see the script in action.
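For readers who want the gist without the script, here is a rough server-side sketch of the same idea in Python. It is not the A List Apart code; the function name and the "highlight" class name are assumptions. The text is split around case-insensitive matches so that only real text gets escaped and wrapped.

```python
import html
import re


def highlight_terms(text, terms, css_class="highlight"):
    """Return HTML-escaped text with each search term wrapped in a span."""
    # Longest terms first so "searching" wins over "search" when both are given.
    terms = sorted((t for t in terms if t), key=len, reverse=True)
    if not terms:
        return html.escape(text)
    pattern = re.compile(
        "(" + "|".join(re.escape(t) for t in terms) + ")", re.IGNORECASE
    )
    # With one capturing group, re.split alternates non-match, match, non-match...
    pieces = pattern.split(text)
    out = []
    for i, piece in enumerate(pieces):
        escaped = html.escape(piece)
        if i % 2 == 1:
            out.append('<span class="%s">%s</span>' % (css_class, escaped))
        else:
            out.append(escaped)
    return "".join(out)


print(highlight_terms("Federated search & Search tips", ["search"]))
```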

The Truth About Federated Searching

, a provider of federated search technology has compiled , published in Information Today.

Do query string operators matter in search interfaces?

Research has reported that about 10% of search engine users utilize query string operators, while the remaining 90% perform simple queries. Do boolean operators, "must include" (+) operators and phrase ("") operators make a difference in search engine results? Mostly no, but sometimes yes, according to this paper in ACM Transactions on Information Systems (Volume 21, Issue 4, October 2003). Caroline Eastman and Bernard Jansen tested the effects of using query string operators on major search engines in their paper, "" to determine if these operators improved the effectiveness of web searching. When they say effectiveness, they are referring to relevance and relative precision of retrieval.

The paper attempts to find out if the use of certain query string operators makes any difference in search engine results. They found that implicit OR combination had a negative effect on performance and implicit AND had a positive effect on performance. As of their writing, MSN and AOL used implicit OR while Google appears to be using implicit AND. They found, generally, that most query string operators did not have a great effect on precision in the search engines tested. Precision was as high for simple queries as for advanced queries using query string operators. They did find, however, that in search engines using implicit OR, phrase operators sometimes had a positive effect on performance. [Note that this research didn't test exclusion operators (i.e. boolean NOT or the minus (-) operator). ]
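To make the implicit AND versus implicit OR distinction concrete, here is a toy sketch (not taken from the paper, and far simpler than any real engine). The three documents and the function names are invented. With implicit OR, a multi-word query matches any document containing at least one of the words, so the result set grows and precision tends to drop; with implicit AND, every word must be present.

```python
def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, query, mode="and"):
    """Combine the query words with an implicit AND or an implicit OR."""
    postings = [index.get(word, set()) for word in query.lower().split()]
    if not postings:
        return set()
    if mode == "and":
        return set.intersection(*postings)
    if mode == "or":
        return set.union(*postings)
    raise ValueError("mode must be 'and' or 'or'")


docs = {
    1: "digital library search",
    2: "public library hours",
    3: "search engine ranking",
}
index = build_index(docs)
print(search(index, "library search", mode="and"))
print(search(index, "library search", mode="or"))
```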

So, summarizing, there is limited advantage to using OR, and possibly some advantage to using PHRASE operators in some search engines. But generally speaking, these query string operators provide little or no benefit to users and are counterproductive in some cases. Interesting? Maybe. I suppose this is saying that most search engines are doing a better job of matching users' expectations for simple searches. With 90% of the population using simple searches, those sophisticated algorithms on the back end become more important. The authors note that while query string operators may be less important for general search engines, there is a place where they are still necessary in order to achieve satisfactory results -- in IR systems that do not have sophisticated matching and ranking algorithms.

Web searches: are they fixed?

Interesting article in Business Week Online regarding paid placements and some potential controversy involving small businesses. I found the link at .


by Ben Elgin, October 6, 2003

Dublin Core 2003: Seattle, WA

The is currently going on in Seattle this week. A couple of the attendees and I will be sharing our notes (and photos) when we've recovered (it's actually still going on). But until then, enjoy the .

Trumping Google? Metasearching's Promise

Information services organizations (libraries) continue to be challenged by information seeking behaviors and expectations of web search engine users. In a recent Library Journal article, Judy Luther . In the article she writes, "For many searchers, the quality of the results matter less than the process -- they just expect the process to be quick and easy." Anecdotally, I've found this to be true of users I've encountered within my organization. For more exhaustive and relevant searching, these searchers can turn to researchers for help -- that is, real subject matter experts who know the sources and how to search them.

Searching multiple databases is a special kind of problem because the databases don't always share the same controlled vocabularies or use the same protocols (e.g. Z39.50, XML). But there is great advantage to users viewing intermixed and deduped search results from multiple sources. The search engine or the federated search engines of the California Digital Library are good examples that show how this works. At the SLA conference, federated search seemed to still be a buzzword among search vendors.
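A rough sketch of the "intermixed and deduped" step, assuming each source has already been queried and returns records as dictionaries with title, year and score fields. Those field names, the normalization rule and the idea that scores are comparable across sources are all assumptions for illustration; real metasearch products have to work much harder on both matching and ranking.

```python
import re


def dedupe_key(record):
    """Normalize title plus year so near-identical records collapse together."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record.get("year"))


def merge_results(*result_lists):
    """Merge result lists from several sources, keeping the best-scored duplicate."""
    merged = {}
    for results in result_lists:
        for record in results:
            key = dedupe_key(record)
            best = merged.get(key)
            if best is None or record["score"] > best["score"]:
                merged[key] = record
    # Intermix everything into one list, highest score first.
    return sorted(merged.values(), key=lambda r: r["score"], reverse=True)


catalog = [{"title": "Trumping Google?", "year": 2003, "score": 0.9, "source": "A"}]
journal_db = [
    {"title": "trumping google", "year": 2003, "score": 0.7, "source": "B"},
    {"title": "Federated Search", "year": 2003, "score": 0.8, "source": "B"},
]
for record in merge_results(catalog, journal_db):
    print(record["source"], record["title"])
```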

See also the related article on by Roy Tennant.

Putting it Together: Taxonomy, Classification & Search

from Jeff Morris in Transform magazine. Combining taxonomy and classification with search gives people a map of the resources available to them. This kind of taxonomy, classification and search combination is becoming essential for the major search vendors. [thanks ]

Video search on PBS.org

Gary Price that PBS is offering free keyword and/or title search of some of its video. Being the father of a two-year-old, I do regular visits to with my son to look for animal videos. Now I can add Nature to my bookmarks.
Searching is quite nice on PBS. You can do keyword searches or browse by show/program title. Odd that they don't let you view the metadata, though. I wondered after searching the Nature archives for "leach" why vampire bat and mosquito videos were returned when what I wanted to find was the bloodsucking leeches from the same "Blood Suckers" show. I guess there is one metadata record shared per show, which makes sense when there are only two or three short videos available per program. That way, obviously related videos are presented in your search results.
Available presently on PBS:

Amazon Plan to offer full-text search of some non-fiction texts

Very interesting news from Amazon today . The retailer is planning a new full-text searching service called "Look Inside the Book II" that will combine some of the functionalities of a digital library with the retailer's current methods for helping customers find and evaluate products. The full-text service will extend the "Peek inside" service that users get when previewing TOCs, indexes, and sample pages with "Look Inside the Book". I couldn't surmise from the article whether full-text searching would be offered only when viewing a single book or if it would be possible to do full-text searching across a corpus of digitized e-texts.

The new service is being met with some wariness from publishers and authors, who worry that it will make Amazon more like an information service a la ebrary and netLibrary. Undoubtedly Amazon will have to do a lot to protect copyright.

Being someone who uses e-text vendors and full-text digital libraries, I think the service could be a boon to the book selling industry. There is no reason that full-text searching of some non-fiction works cannot be offered while still protecting copyright. If brief keyword in context (KWIC) displays of search terms are given to offer some help in filtering and refining your search without publishing too much information, then how can this hurt publishers? No doubt, some works such as reference books would give away too much in even a brief KWIC display, but surely there must be a way to make this work. I think it's a good step in making the Amazon shopping experience even more valuable. It's amazing that they continue to innovate the experience of buying online.
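For anyone unfamiliar with KWIC displays, here is a minimal sketch of the idea, assuming the book text is available as one plain string. The function name and the window size are arbitrary choices, and this says nothing about how Amazon actually builds its snippets. Each hit shows only a few characters on either side of the search term, which is what keeps the display from giving away the whole text.

```python
import re


def kwic(text, term, width=30):
    """Return short keyword-in-context snippets for every match of term."""
    snippets = []
    for match in re.finditer(re.escape(term), text, re.IGNORECASE):
        # Clamp the context window to the bounds of the text.
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(text) else ""
        snippets.append(prefix + text[start:end] + suffix)
    return snippets


sample = "The leech is a segmented worm. Some leeches feed on blood."
for snippet in kwic(sample, "leech", width=12):
    print(snippet)
```

A smaller width reveals less of the work per hit, so the window size is the obvious knob for a publisher worried about giving away too much.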

Federated search overview

I recently heard Roy Tennant tell a group of information professionals that "only librarians like to search, everyone else likes to find." Roy should get together to combine his findability meme with this appropriate tagline.

In keeping with this findability theme, Tennant's describes some of the in his latest Digital Libraries column in Library Journal. This is an area that is hot with vendors in the information searching space right now.

CIO Article on Auto/Semi-categorization software

CIO article by Fred Hapgood features a couple of examples of how automatic and semi-automatic categorization helps businesses and reduces costs. There is a company list included if you're interested in this arena.

Data Management meets Unstructured Information

Just came back from a conference on . A recurring topic that surfaced about data management was the relevance of their work in relation to unstructured information. A reality check for everyone was that most corporate information actually exists in semi-structured or unstructured form and not in databases. From this thought, I was directed to DM Review and in particular this article. - By Robert Blumberg and Shaku Atre. I definitely see an opportunity for IA (metadata/UX) type folks to cross-pollinate with data modelers and data managers. It will be interesting to see and I look forward to hearing more from here. Thoughts?

New Yahoo! Search debuts

Yahoo has debuted its . Much cleaner, and looks like it's aimed directly at Google. I like the search results screen a lot...it does a great job of showing what index (web, directory, images, etc.) the results are from.

There's a tour with highlighting different search elements. Something else interesting is the use of - prefix the word 'map', type an address, and you're hooked into Yahoo! Maps; type 'weather' and a city, and you've got the forecast; type a zip code with your search and you're looking at local Yellow Pages. Reminds me of parts of Paul Ford's . While Google makes a good foil, it's not the only player that pays attention to such things.

Another interesting feature - you can "ScreenDial" around Yahoo - type a word followed by an exclamation point, and get to a specific screen: So mail! goes to Yahoo! Mail, while news! goes to...well, you get the picture.
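Purely as an illustration of how this kind of query routing might be sketched (I have no idea how Yahoo implements it), the snippet below checks a query for a trailing exclamation point or a leading shortcut word before falling back to ordinary web search. The SHORTCUTS table, the route names and the assumption that 'weather' comes before the city are all mine.

```python
# Hypothetical shortcut keywords mapped to hypothetical destinations.
SHORTCUTS = {"map": "maps", "weather": "weather"}


def route_query(query):
    """Return a (destination, argument) pair for a raw search box entry."""
    q = query.strip()
    # "mail!" style entries jump straight to a named screen.
    if len(q) > 1 and q.endswith("!") and " " not in q:
        return ("goto", q[:-1].lower())
    # "map <address>" or "weather <city>" go to a specialized service.
    keyword, _, rest = q.partition(" ")
    rest = rest.strip()
    if keyword.lower() in SHORTCUTS and rest:
        return (SHORTCUTS[keyword.lower()], rest)
    # Everything else is a plain web search.
    return ("web", q)


for entry in ["map 701 First Ave", "weather Seattle", "mail!", "map"]:
    print(entry, "->", route_query(entry))
```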

Excellent stuff, and congrats to the Yahoo! search team :-)

I'm curious though - what do ia/ readers think? Improvement? Google-envy? What could be better? What is outstanding? Let us know in the comments...

Update LOL - Andrés Sulleiro points out promotion.

Good discussion over at

A day in the life of BBCi Search

for a site that should get as much attention as Amazon for the content producing crowd. BBC is doing a lot of innovative things, and more importantly, the process behind the innovation gets shared on a regular basis.

Customer Experience Whitepapers

Change Sciences has an archive of they've produced. Free registration required. Topics include writing for the web, navigation and orientation, search, checkout, user registration, and two interesting 'design paradoxes' articles. Most interesting to me is the recent task design article, and the two older, but still valuable ROI & Investing in User Experience papers.

Semantic search project for Movable Type

From

Maciej Ceglowski has built a prototype for a semantic search engine. To adapt it to function as a Movable Type plugin, he needs sample content that he can test against.

If successful, the search feature would let you do a keyword search, and get back relevant results even when there was no exact keyword match.
If you use Movable Type, and you'd like to help out, .

Maciej is using to enable local search beyond keyword indexing. Sounds like an ambitious and exciting project.

Why can't everyone have site searches like this?

While searching for some obscure hardware from antiquity on the site I spotted that they have an extremely cool site search system. Just searching does the standard things. Once you have your results, the search box also gives options for "Fuzzy", "Stemming", "Phonic" and "Natural Language". I think these options are great for refining a little better. Clicking on each of the options brings up a window with a handy definition. I just thought more sites should give users a little bit of flexibility and credit for understanding concepts like WD.
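For a feel of what two of those options do, here is a small sketch using only the Python standard library. It has nothing to do with the engine behind that site: the fuzzy match leans on difflib's similarity ratio, and the stemmer is a deliberately naive suffix stripper (a real engine would use something like the Porter algorithm). The vocabulary, cutoff and suffix list are assumptions; phonic and natural language matching are not covered.

```python
import difflib


def fuzzy_match(term, vocabulary, cutoff=0.8):
    """Return vocabulary words that are close to a possibly misspelled term."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)


def naive_stem(word):
    """Strip a few common English suffixes; crude, for illustration only."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


vocabulary = ["taxonomy", "thesaurus", "search", "metadata"]
print(fuzzy_match("taxonmy", vocabulary))
print([naive_stem(w) for w in ["searching", "searches", "search"]])
```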

Evaluating 25 E-Commerce Search Engines

pointed to this new 37signals report, , a $99 report with 22 Best Practices for E-Commerce Search Engines.
