PROBLEM:
How do I convert an SPSS Portable file to a SAS Data Set?
************************************************************
SOLUTION:
SAS can read data originally saved in SPSS if it is in the form
of an SPSS Export file. SPSS Export files are created in SPSS
with the EXPORT OUTFILE=<filename>. command, or by clicking
File>Save As and creating a *.por (portable) file. If the files are
created on another system, they must be moved in text (ASCII) mode,
*not* in binary mode, to the system where SAS will convert them.
NOTE: The issue of converting SPSS Value Label information into
SAS is a separate topic covered in detail below. If you
have no Value Labels in SPSS, or are willing to 'do without'
them in your SAS data set, you may ignore that section.
In UNIX
-------
This example shows the statements needed to read an SPSS Export
file and output a SAS Data Set in UNIX:
libname test '~/sasdata'; * This is where the SAS
Data Set will be written;
libname apple spss 'spss.por'; * This names your SPSS Export file
and uses the SPSS read engine;
proc copy in=apple out=test;
run;
proc print data=test._first_ ; * This is to see if it works;
run;
You may want to rename the SAS Data Set (_first_ is somewhat awkward).
This can be done with an operating system command (in UNIX, the
command is "mv") or in SAS. To rename it in SAS, run the following
statements:
proc datasets library=test;
change _first_=avocado;
run;
quit;
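As an alternative to PROC COPY followed by a rename, a DATA step can
read the member _first_ through the SPSS engine and write it under the
name you want in one pass. The following is a minimal sketch that reuses
the librefs and names from the UNIX example above (test, apple, avocado).
The PROC CONTENTS step is only a suggested check of the variable names
and labels.

```sas
libname test '~/sasdata';        /* where the SAS Data Set will be written */
libname apple spss 'spss.por';   /* SPSS Export file, read with the SPSS engine */

/* Read the SPSS data and store it directly as test.avocado */
data test.avocado;
  set apple._first_;
run;

/* Suggested check: list the variables, types, and labels that came across */
proc contents data=test.avocado;
run;
```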
In Windows or Macintosh
-----------------------
To use the above example in other operating systems, the two LIBNAME
statements must be changed to point to directories and/or files
appropriate for the operating system; otherwise, the rest of the
statements shown above are exactly the same under SAS for Windows
or Macintosh.
SAS for Windows (or SAS-PC) LIBNAME statements:
libname test 'c:\sasdata'; * This is where the SAS
Data Set will be written;
libname apple spss 'c:\spss.export'; * This names your SPSS
Export file and uses the
SPSS read engine;
SAS for Macintosh LIBNAME statements:
libname test 'Hard_Disk:MySASFiles:'; * This is where the SAS
Data Set will be written;
libname apple spss 'Hard_Disk:Desktop:spss.export';
* This names your SPSS
Export file and uses the
SPSS read engine;
Converting SPSS Value Labels into SAS User-Defined Formats
----------------------------------------------------------
There is no method provided by either SAS or SPSS to convert
SPSS Value Labels into SAS Formats. Many difficult issues are
involved, but the problems mostly stem from the following
basic difference:
SPSS Value Labels are stored along with the data in the
SPSS Active File (or, in permanent form, in the SPSS Save File).
SAS labels for individual values are called Formats, and are
stored in a Format Library which is a SAS Catalog separate
from the SAS data set.
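To illustrate the SAS side of this difference, here is a minimal sketch
with made-up data. The format name SEXFMT, the data set WORK.DEMO, and
the variable names are hypothetical and are not produced by any program
mentioned here. The labels are defined once in PROC FORMAT and are only
attached to the variable by name; the data set itself stores the values
1 and 2.

```sas
/* Define the labels. They go into a format catalog (WORK.FORMATS here),
   not into any data set. */
proc format;
  value sexfmt 1 = 'Male'
               2 = 'Female';
run;

/* Hypothetical data: the stored values are just the numbers 1 and 2. */
data work.demo;
  input id sex;
  datalines;
1 1
2 2
3 2
;
run;

/* Attach the format by name when printing; without the FORMAT statement
   the raw values 1 and 2 would be shown. */
proc print data=work.demo;
  format sex sexfmt.;
run;
```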
UCS provides a program to perform this conversion. It is called
spss2sas.prg, and it may be downloaded from the address given in
step 4 of the instructions below. The program will work
in SAS for Windows or SAS for Macintosh. Once downloaded, you can
run the program as shown in the instructions below.
The program will display some windows in which you can specify the
information necessary for SAS to convert the SPSS Value Label
information into SAS Formats. Before running the SAS program, of
course, you must prepare the SPSS Data file to be processed by SAS.
Here are the instructions for converting SPSS Data and Value Labels
into a SAS Data Set with SAS Formats:
1. The best thing to do first is to choose a name for the set
of files that the program will use and create. The name
should be 8 characters or fewer, and must begin with a letter.
Let's say you choose 'my1st' as the name. You should use
that name when you create the two SPSS files used as input
(described below):
my1st.por my1st.txt
And you should use the same name when prompted within the
program's windows for the SAS data set (which would become
my1st.sas7bdat) and the prefix for the two format-name files
which the program creates:
my1st.prc my1st.fmt
When choosing a name, keep in mind that it should be
meaningful -- related to the nature of the data or project
with which the files are associated -- and it must be unique,
so as not to overwrite or conflict with other files in the
directories you use to store files.
The example 'my1st' will be used in these instructions.
2. Create a text file with the SPSS Value Label information in it.
Each version of SPSS gives you the ability to display and save
the Dictionary information for your SPSS data. The spss2sas
program uses this information to create the SAS Formats later.
Once the data are read into an SPSS Active File, you can display
and save the Dictionary information, which includes the Value
Labels. In the Windows and Macintosh versions, the Active File
will appear as data in the spreadsheet-like Data Window.
Before creating the text file as described here, it is important
that you DELETE everything from the Output window just before
running the DISPLAY command, or just before clicking the
File Info button. This instruction will be repeated below.
In SPSS for Windows or Macintosh: With the Active File in place
(i.e., the data are visible in the Data Window), and the Output
Window either deleted or cleared (to remove any previous
output), open a Syntax window and enter and run the following:
DISPLAY DICTIONARY.
EXECUTE.
Click File > Export, then choose the *.txt file type, then enter a
path and file name (e.g., c:\sasstuff\my1st.txt) and click OK.
3. Save the Active File into a Portable (*.por) file. In the Windows or
Macintosh versions, with the Data Window (Active File) active,
click File>Save As, then choose the *.por file type, enter a
file name (e.g., my1st.por) then click OK. If you prefer to use
a Syntax window, enter and run the following:
EXPORT OUTFILE='my1st.por'.
EXECUTE.
4. Now that you have created the SPSS files needed for input, you
are ready to run the SAS program that does the conversion.
If you want your own copy of the SAS program, download it from
http://www.usc.edu/ucs/userserv/statistics/sas/faq/programs/spss2sas.prg
If you prefer, you can find a copy in almaak (RCF) or aludra (SCF).
After the last window appears and you exit the program, you will
have a file called spss2sas.prg.log, which will contain a record
of the conversion program's work. If there were problems, they
will be noted in this LOG file.
NOTE: When PROC COPY reads an SPSS Portable (*.por) file that
is bad, the Log does not say so; it looks as if everything
went OK. If you see a later error
message to the effect that CONVERT._FIRST_ was not found,
you should suspect that the SPSS Portable file is bad, and
re-create it, ideally on the same system where the SAS
program is running.
5. The conversion program creates a SAS data set, stored in any directory
you specify. In addition, the program creates two files that
are used during the program's execution and are then saved in the
directory you specify, in case you need them in the future.
a. my1st.prc -- this is the PROC FORMAT statement that
creates all the formats necessary to accommodate what were
the SPSS value labels. Formats are named consecutively
and begin with the prefix letter(s) you specify while
the conversion program is running. For example, if
you specify 'ab' as the format prefix, the program
creates AB1FMT., AB2FMT., AB3FMT., and so forth, until
all needed formats are created. If you chose 'temporary'
when the program asked you about how to store the formats,
then this PROC FORMAT statement stored in 'my1st.prc'
will need to be run each time you use the SAS data set
created by the conversion program. If you chose 'permanent'
there is no need to run this PROC FORMAT each time. However,
the formats.sct01 or formats.sc2 file that holds the permanent
formats must be made available to SAS whenever the SAS data
set is accessed (see the usage examples below).
b. my1st.fmt -- This is the FORMAT statement that is used
when the SAS data set is being created by the conversion
program. Normally, you will not need to use this FORMAT
statement as such in the future, but it could be useful
as a 'map' to which formats are being used for which
variables. Since the format names themselves have to
be arbitrary (e.g., AB1FMT., AB2FMT., etc., as above), this
will be the most convenient way to match variables and
their format names.
6. Here are some examples of how to use the SAS Data Set and
Formats created with the conversion program. The examples assume
that you have stored all files in a subdirectory called
c:\sasstuff or HardDisk:SasStuff, depending on the operating
system. These examples will use the first (i.e., Windows)
syntax. If your operating system is a Macintosh, substitute
accordingly.
If you chose 'permanent' formats:
--------------------------------
libname library 'c:\sasstuff';
libname storesas 'c:\sasstuff';
proc print data=storesas.my1st;
run;
The LIBRARY libref is necessary to point SAS to the permanent
formats catalog.
If you chose 'temporary' formats:
--------------------------------
libname storesas 'c:\sasstuff';
%include 'c:\sasstuff\my1st.prc';
proc print data=storesas.my1st;
run;
The %INCLUDE statement calls in the PROC FORMAT stored in
my1st.prc. If you prefer, you can use other methods to
run the PROC FORMAT statement found in my1st.prc. For example,
you can copy my1st.prc into another file, such as now.sas,
then add the LIBNAME and DATA or PROC steps of your choice,
then run the now.sas program as a unit. It might look like:
proc format;
value ab1fmt <etc., as found in my1st.prc>
. . .
run;
libname storesas 'c:\sasstuff';
data subset1; set storesas.my1st;
keep <variable list>;
run;
proc freq data=subset1; tables <variable list>;
run;
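If neither the permanent formats catalog nor my1st.prc is at hand, SAS
normally stops with an error when it meets a variable whose format
cannot be found. The sketch below shows one possible workaround, using
the standard SAS system option NOFMTERR with the example names from
these instructions. It is a suggestion only and is not part of the
conversion program; the values print as the raw stored codes rather
than as labels.

```sas
/* Do not stop when a format attached to a variable cannot be found */
options nofmterr;

libname storesas 'c:\sasstuff';

/* Values appear as the stored codes, since the formats are unavailable */
proc print data=storesas.my1st;
run;

/* Restore the default behavior */
options fmterr;
```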