HPSC -- High Performance Statistical Computing for Data Intensive Research
Next Generation Statistical Analytics
This page will be finished when I have time. A few ideas are left here for now, and the rest will happen someday.
``Today's Big Data, Tomorrow's Tiny Toys.''
``Be Bigger, Be Faster, Be Easier, and Be Greener.''
``All models are wrong but some models are useful.'' -- George Box
How about I say ``All data are wrong but some functions of data are useful.'' ``Is a sufficient statistic a function of the data?'' How about communicating with ``minimal sufficient statistics''? ``Haven't statisticians been doing this for a looo...ooong time?''
``Big Dream of Bigger than Big.'' No worries about large-memory and out-of-memory problems: simply don't load all the data on one machine. Ambitiously, try a smartphone.
No more parallel computing, but ``Read in distributed, Compute in distributed, Statistics in distributed, Output in distributed.''
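A minimal sketch of the ``communicate with sufficient statistics'' idea, assuming a normal model so that (n, sum of x, sum of x^2) is sufficient. Each chunk stands in for the data held on one machine, and only three numbers per chunk are ever combined. The names `local_stats`, `combine`, and `mean_and_variance` are made up for illustration and are not taken from any package.

```python
from functools import reduce


def local_stats(chunk):
    """Sufficient statistics (n, sum x, sum x^2) computed from one local chunk."""
    n = len(chunk)
    s = sum(chunk)
    ss = sum(x * x for x in chunk)
    return (n, s, ss)


def combine(a, b):
    """Merge two sets of sufficient statistics by adding them componentwise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(stats):
    """Recover the sample mean and unbiased sample variance from the totals."""
    n, s, ss = stats
    mean = s / n
    var = (ss - n * mean * mean) / (n - 1)
    return mean, var


# Hypothetical data: three chunks, as if stored on three different machines.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

# Only the small tuples are "communicated"; the raw data never move.
total = reduce(combine, map(local_stats, chunks))
mean, var = mean_and_variance(total)
```

The raw sum of squares can lose precision when the mean is large relative to the spread, so this is only meant to show the communication pattern, not a numerically careful implementation.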
Useful one-pass statistical algorithms should be established
for on-the-fly stream data analysis and in-situ experiments.
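As one well-known example of a one-pass algorithm, here is a sketch of Welford's running mean and variance. It touches each observation once and keeps only three numbers in memory, so it could in principle follow a stream as it arrives. The class name `RunningStats` and the toy stream are assumptions for illustration only.

```python
class RunningStats:
    """Welford's one-pass update for the mean and variance of a stream."""

    def __init__(self):
        self.n = 0        # number of observations seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Absorb one new observation without storing it."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Unbiased sample variance; undefined (NaN) with fewer than two points."""
        if self.n < 2:
            return float("nan")
        return self.m2 / (self.n - 1)


# Hypothetical stream: values are consumed one at a time and then discarded.
stream = iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
rs = RunningStats()
for x in stream:
    rs.update(x)
```

After the loop, `rs.mean` and `rs.variance()` hold the summary of everything seen so far, and the same object can keep absorbing new values as more data arrive.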
Created: Oct 19 2011
Last Revised: Feb 13 2013, 12:20 (CDT Ames, IA, USA)
Maintained: Wei-Chen Chen
E-Mail: wccsnow @ gmail.com