PROBLEM: I'm going to create a SAS data set from raw data, and
I want to get an idea beforehand of how large the SAS
data set is going to be.
SOLUTION: SAS suggests a formula for estimating the size of a
SAS data set, originally printed on page 465 of
the SAS-PC (Release 6.04) Language guide. A summary follows:
The formula for calculating the size of a SAS data set is
(218 + (v*106)) + (nobs*(tvl + 4))
where v is the number of variables, nobs is the number of
observations, and tvl is the total of all variable lengths.
For example, suppose you have a data set with 2 variables (v=2),
each of which has a length of 8 (tvl=16), and 10 observations
(nobs=10). The estimated size of the data set is
(218 + (2*106)) + (10*(16 + 4)) = 630 bytes
SAS stresses that this is a guideline, not an exact
formula.
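The guideline formula can be sketched as a small Python function,
which makes it easy to try different variable counts and lengths.
This is only an illustration: the function name
estimate_sas_dataset_bytes and its parameters are made up here and
are not part of SAS, and the result is an estimate, as noted above.

```python
def estimate_sas_dataset_bytes(variable_lengths, nobs):
    """Estimate a SAS data set's size in bytes with the guideline formula.

    variable_lengths: the length in bytes of each variable
    nobs: the number of observations
    (Hypothetical helper for illustration; not a SAS function.)
    """
    lengths = list(variable_lengths)
    v = len(lengths)        # number of variables
    tvl = sum(lengths)      # total of all variable lengths
    # Part of the formula that does not depend on nobs.
    overhead = 218 + (v * 106)
    # Part of the formula that grows with the number of observations.
    observations = nobs * (tvl + 4)
    return overhead + observations


# The example from the text: 2 variables of length 8, 10 observations.
# 218 + 212 + 200 = 630
print(estimate_sas_dataset_bytes([8, 8], 10))
```

Passing the lengths as a list means v and tvl are both derived from
the same input, so they cannot disagree with each other.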
Note that the formula needs the number of variables, their lengths,
and the number of observations, so it cannot be used to calculate
the potential size of a SAS data set from the size (in bytes) of a
raw data file. There is probably no way to estimate the size
accurately from the size of the raw data file alone.
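A small Python sketch of why the raw file size alone is not enough:
the two made-up raw files below are both 40 bytes long, yet the
guideline formula gives different estimates because they hold
different numbers of observations. The file contents, the helper
name, and the assumption that every field is stored as a variable of
length 8 are all hypothetical and chosen only for illustration.

```python
def estimate_sas_dataset_bytes(variable_lengths, nobs):
    """Guideline formula: (218 + (v*106)) + (nobs*(tvl + 4))."""
    lengths = list(variable_lengths)
    return (218 + len(lengths) * 106) + nobs * (sum(lengths) + 4)


# Two made-up raw files, each exactly 40 bytes long.
raw_a = "1 2\n" * 10                   # 10 short records of 4 bytes
raw_b = "123456789 123456789\n" * 2    # 2 long records of 20 bytes

estimates = {}
for name, raw in (("raw_a", raw_a), ("raw_b", raw_b)):
    nobs = len(raw.splitlines())       # one observation per record
    # Assumption: both fields are stored as variables of length 8.
    estimates[name] = estimate_sas_dataset_bytes([8, 8], nobs)
    print(name, len(raw), "raw bytes,", nobs, "observations,",
          estimates[name], "estimated bytes")
```

The estimate depends on how many records the raw file contains and
on the lengths chosen for the variables, neither of which can be read
off the raw byte count.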