20130191523 | REAL-TIME ANALYTICS FOR LARGE DATA SETS - A cloud computing system is described herein that enables fast processing of queries over massive amounts of stored data. The system is characterized by the ability to scan tens of billions of data items and to perform aggregate calculations such as counts, sums, and averages in real time (less than three seconds). Ad hoc queries, including grouping, sorting, and filtering, are supported without the need to predefine them; this is achieved through highly efficient loading and processing of data items across an arbitrarily large number of processors. The system does not require a fixed schema and thus supports any type of data. Calculations made to satisfy a query may be distributed across a large number of processors to parallelize the work. In addition, an optimal blob size for storing multiple serialized data items is determined, and existing blobs that are too large or too small are proactively redistributed or coalesced to increase performance. | 07-25-2013 |
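
The distributed aggregation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each blob is already deserialized into a list of schema-free dictionaries, and it uses a thread pool as a stand-in for the many processors the abstract mentions. The names `partial_aggregate`, `merge_partials`, and `run_query` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(blob, group_key, value_key, predicate=None):
    """Scan one blob and return {group: [count, sum]} for matching items."""
    partial = defaultdict(lambda: [0, 0.0])
    for item in blob:
        # Filtering step of the ad hoc query (optional).
        if predicate is not None and not predicate(item):
            continue
        # No fixed schema: items lacking the aggregated field are skipped.
        if value_key not in item:
            continue
        acc = partial[item.get(group_key)]
        acc[0] += 1
        acc[1] += item[value_key]
    return dict(partial)


def merge_partials(partials):
    """Combine per-blob partial results into final count, sum, and average."""
    merged = defaultdict(lambda: [0, 0.0])
    for partial in partials:
        for group, (count, total) in partial.items():
            merged[group][0] += count
            merged[group][1] += total
    return {
        group: {"count": count, "sum": total, "avg": total / count}
        for group, (count, total) in merged.items()
    }


def run_query(blobs, group_key, value_key, predicate=None, workers=4):
    """Aggregate each blob independently, then merge the partial results.

    Assumption: threads stand in for the separate processors of a real
    deployment; the worker count is an arbitrary illustrative default.
    """
    def scan(blob):
        return partial_aggregate(blob, group_key, value_key, predicate)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge_partials(pool.map(scan, blobs))


# Example usage with two small blobs of heterogeneous items.
blobs = [
    [{"region": "eu", "amount": 10}, {"region": "us", "amount": 5}],
    [{"region": "eu", "amount": 30}, {"note": "no amount field"}],
]
totals_by_region = run_query(blobs, "region", "amount")
```

Because counts and sums are associative, each blob can be scanned independently and only the small partial results need to be merged. Averages are derived at the end from the merged count and sum rather than being averaged per blob.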