Friday, April 24, 2009


So what is all the fuss about? Why do we even want to consider FAST as a search engine? What does FAST do that other search engines fail to do, and why?

As we all remember, in the early 2000s it was all about data: collecting data, “information is power”. Almost every company jumped on the “data collection” wagon. A few years later they were forced to collect data by the laws and regulations imposed by government. Around the same time, and even earlier, the wave of data analysis and data mining became incredibly strong. Companies realized that the data they had was an unknown mass, potentially incredibly valuable, but its value was obscured by the quantity and often poor quality of the data.

The SharePoint “age” started educating companies about the true meaning of the “garbage in, garbage out” concept. I personally often tell clients that the SharePoint deployment stage should be treated as a “Spring Cleaning” and reorganization of content.

But is the “Spring Cleaning” approach always valid and feasible? No.

Can large enterprises afford such an undertaking? Most likely not. Hiring an army of temps to tag documents will not cut it, since temps lack a subject matter expert’s knowledge and business insight.

A properly configured FAST can impose multiple layers of taxonomy structure on already existing content, thus providing unparalleled insight into and access to information.

When the FAST “bug” bit me, I tried to be my own devil’s advocate and started researching FAST competitors in the enterprise search market, such as Autonomy, Endeca, and the Google appliance. Unfortunately, in my research into FAST competitors, I could not get past the sales reps, who had trouble answering even basic questions, to more “techy” people (so far I’m drinking the sales reps’ Kool-Aid). I had to resort to comparing the type and level of influence or administrative intervention you have over the search engine in terms of content relevance, analysis, ranking… you name it. FAST prevailed over the other guys on many levels.

Let’s take the analogy of apple trees in an orchard (ha-ha-ha, my Russian really comes through here). Companies say: I want Google-quality search within the enterprise.

Google is famous for Internet search, but this is the same as going to a free-range farm, shaking an apple tree, and then using the fruit that falls to the ground. You have no control over which apples fall first, and the pattern in which they fall and land on the ground gives you no clue about the quality of the fruit.

Using FAST Search is like going to a farm that specializes in cultivating the best apples, so you get the best fruit all the time. It is presented to you in whatever way you need: by the ton, in baskets, in a 5-pound paper bag, raw, or processed into apple cider, puree, etc. (even sorted by color).

In conclusion, I just wanted to add a couple of facts about FAST (in no particular order):

  • FAST can handle 40,000 terabytes (40 petabytes) of data
  • 10 billion documents
  • 2,000 queries/second

And all of it can be expanded through federated installations!

And according to Dstar – Data News company:

  • 1,000 terabytes is about one hundred times the contents of the Library of Congress, the largest library in the world, with more than 18 million books, 2.5 million recordings, 12 million photographs, 4.5 million maps, and 54 million manuscripts

According to

  • Approximately fifteen thousand terabytes of data will be generated each year in particle physics experiments using CERN’s Large Hadron Collider, which started up in September 2008
  • As of November 2006, eBay had 2,000 terabytes of data
  • In 2007, NOAA maintained approximately 1,000 terabytes of climate data. NOAA expects that their Comprehensive Large Array-data Stewardship System (CLASS) library will hold 20,000 terabytes of data by 2011, and 140,000 terabytes by 2020
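
To put these numbers side by side, here is a small Python sketch that converts each quoted figure to petabytes and shows how much of the quoted 40,000-terabyte FAST capacity it would take up. Every figure is copied from the lists above; the names (`FAST_CAPACITY_TB`, `quoted_datasets_tb`, `tb_to_pb`, `share_of_capacity`) are mine and purely illustrative, and I am assuming decimal units (1 petabyte = 1,000 terabytes), the same conversion used above for 40,000 terabytes.

```python
# Figures quoted in this post, all in terabytes (TB).
FAST_CAPACITY_TB = 40_000

quoted_datasets_tb = {
    "CERN LHC, per year": 15_000,
    "eBay, November 2006": 2_000,
    "NOAA climate data, 2007": 1_000,
    "NOAA CLASS, 2011 projection": 20_000,
    "NOAA CLASS, 2020 projection": 140_000,
}


def tb_to_pb(terabytes):
    """Convert terabytes to petabytes, assuming 1 PB = 1,000 TB."""
    return terabytes / 1000


def share_of_capacity(dataset_tb, capacity_tb=FAST_CAPACITY_TB):
    """Return the dataset size as a fraction of the quoted capacity."""
    return dataset_tb / capacity_tb


for name, size_tb in quoted_datasets_tb.items():
    print(
        f"{name}: {tb_to_pb(size_tb):g} PB, "
        f"{share_of_capacity(size_tb):.1%} of the quoted FAST capacity"
    )
```

A share above 100% simply means the dataset is larger than the quoted single-installation figure, which is where the federated installations mentioned above would come in.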

Isn’t that alone impressive?

Enjoy :-)


1 comment:

freddiemaize said...

definitely impressive. Especially the apple-fruit-farm example!!