Questions related to “big data” have many different facets depending on how “big” the data are and the context of their use. For example, the use of computers in financial markets has given rise not only to a new discipline (econophysics) but also to a complete change in how financial data are modeled, using methods such as the renormalization group (RG) that are common in statistical physics and quantum field theory. RG methods are particularly useful in treating complex systems where strong compression of data is required, since many of these systems exhibit divergences, resulting in an infinite number of degrees of freedom. This problem of dealing with “big” data is, however, somewhat different from the problem of “really big” data one encounters at the LHC. Here we require much more than mathematical techniques for data reduction and instead rely on a sophisticated and complex framework for filtering, processing, storing and analyzing both signal and simulated data. Beyond these problems lies the larger issue of data “validation”. I discuss this comparison between “big” and “really big” data, the important differences, and some of the more problematic issues related to validation.
