/************* for 1990 Census Files *********************************
NOTE: You can copy this to a file directly from here, and use the
program shown below as a sample, so that it need not be re-typed
entirely. Depending on your browser, you can probably click 'File'
then 'Save As' to save this into a usable file.
**********************************************************************

Making Subsets of 1990 STF1A Census Data
for Use in Packages Other than SAS or SPSS
----------------------------------------------------------------------

Some of the Census files are larger than 1 GB and, as such, often
cannot be moved to UNIX disk. Creating a text-only (ASCII) subset of
the data may therefore be helpful, and SAS can be used for this
purpose. Once the data you want are in a SAS data set, a simple SAS
PUT statement can write them out in "raw" (ASCII) form, to be read
into whatever package you are going to use.

A sample SAS program that reads a subset of Census variables from the
STF1A file and writes raw data is shown here. (This directory also
contains a similar file for STF3A data.)

For the example, we assume you are using seven Census variables, five
from the first record and two from the second. (Each "logical" record
of the STF1A data is split into two physical records of 4805 bytes
each. These are referred to in the Census documentation as
"segments".)

Some Census variables are character variables, which come out as
completely blank spaces if missing. For that reason the output step
specifies the starting column of each variable with the @1, @10, @15
(etc.) pointers, so that the data 'line up' regardless of width or
missing values.

An example of the output follows the program.
***********************************************************************/

FILENAME RAW '~datastor/census/census1990/census90.stf1a.ca';

DATA CENSUS;
  INFILE RAW LRECL=4805;
  /* The slash moves on to the second physical record (segment). */
  INPUT FILEID   $ 1-8
        STUSAB   $ 9-10
        SUMLEV     11-13
        GEOCOMP  $ 14-15
        CHARITER   16-18
      / H13_1      2389-2397
        H19        2767-2775;
RUN;

FILENAME SUBSET 'my.raw.census.subset';

DATA _NULL_;
  SET CENSUS;
  FILE SUBSET;
  /* The @n pointers start each variable at a fixed output column. */
  PUT @1  FILEID
      @10 STUSAB
      @15 SUMLEV
      @20 GEOCOMP
      @25 CHARITER
      @30 H13_1
      @40 H19;
RUN;

/*********************************************************************
(NOTE that even if you are not reading any variables from the second
physical record or segment, you still must have the slash in the
INPUT statement so that SAS knows there are two physical records per
observation.)

To use the above file for your own work, you need change only the
INPUT statement and the PUT statement.

For the INPUT statement, simply list the variables and column
specifications (with the $ if you're reading an alphanumeric
variable) from the Census code book for those variables you want to
read.

For the PUT statement, just list the same variable names that you
used in your INPUT statement. Leave out the columns and the $; add
the @ column pointers only if you need the values lined up in
columns.

To run the program, type it into a UNIX file, then enter sas followed
by the name of that file. SAS will create the raw data file
'my.raw.census.subset' (or whatever you decide to call it). The
contents of this file, as generated by the example above, will look
like this (only the first ten observations are shown):

STF1A    CA   040  00   0    457885    29008161
STF1A    CA   040  40   0    764       45312
STF1A    CA   040  42   0    0         0
STF1A    CA   040  43   0    0         0
STF1A    CA   050  00   0    17950     1242068
STF1A    CA   060  00   0    890       68633
STF1A    CA   070  00   0    890       68633
STF1A    CA   080  00   0    8         3288
STF1A    CA   091  00   0    5         1190
STF1A    CA   091  00   0    0         813

You can then use this raw file as input to the statistical package
of your choice.
**************************************************************/
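
/*********************************************************************
The two notes above (the required slash, and a PUT statement without
column pointers) can be sketched as follows. This is only an
illustration, not part of the original sample: the data set name
CENSUS1, the fileref LISTOUT and the file name 'my.raw.census.list'
are made up for the example, and the fileref RAW is assumed to be
still assigned by the FILENAME statement earlier in this file.
*********************************************************************/

DATA CENSUS1;
  INFILE RAW LRECL=4805;
  /* Only first-segment variables are read; the trailing slash     */
  /* still skips the second physical record of each observation.   */
  INPUT STUSAB $ 9-10
        SUMLEV   11-13
        /;
RUN;

FILENAME LISTOUT 'my.raw.census.list';   /* hypothetical file name */

DATA _NULL_;
  SET CENSUS1;
  FILE LISTOUT;
  /* No column pointers: values are separated by a single blank.   */
  PUT STUSAB SUMLEV;
RUN;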