Statistics Refreshers and Glossaries of Terms
Beware the procrastination monster. Statistical analysis takes time.
The help sections of the statistics packages are often good resources for a refresher on a concept you may have forgotten. In particular, the help sections of StatPlus:Mac LE and SPSS have useful information. The best resource available to you for a statistics refresher is Paula Lackie (x 5607, or email plackie). Links to all things stats-related, organized by category, can be found at http://statlink.tripod.com/.
Additionally, there are online learning programs and tutorials available.
- Carnegie Mellon offers a free online program for learning statistics available here. It is extremely well done.
- ICPSR has many learning guides available in their online learning center. For guides organized by type of analysis, go here. ICPSR also has a help center with information on using data.
- http://www.teachingwithdata.org/
- http://statpages.org/javasta4.html
- UCLA has a beautiful table for choosing the appropriate statistical analysis, including HOW TO DO IT in Stata & SPSS! It's slightly out of date, but useful nonetheless.
- What the heck is the difference between Categorical, Ordinal & Interval data? (psst: "categorical" is also called nominal and "interval" is like scale or continuous.)
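
As a rough illustration of that distinction, the sketch below uses only Python's standard `statistics` module on made-up survey values (the variable names and data are hypothetical, invented for this example). It shows which summary is usually meaningful at each level: the mode for categorical (nominal) data, the median for ordinal data, and the mean for interval data.

```python
import statistics

# Hypothetical survey responses, invented for illustration only.
majors = ["biology", "history", "biology", "physics", "biology"]         # categorical (nominal)
agreement = ["disagree", "neutral", "agree", "agree", "strongly agree"]  # ordinal
temps_f = [61.0, 64.5, 70.0, 58.5, 66.0]                                 # interval

# Categorical: values are just labels, so counting (the mode) is the sensible summary.
most_common_major = statistics.mode(majors)

# Ordinal: the categories have an order, but the gaps between them are unknown,
# so convert each answer to its rank and take the median rank, not an average.
scale = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
ranks = [scale.index(answer) for answer in agreement]
median_agreement = scale[statistics.median_low(ranks)]

# Interval: differences between values are meaningful, so a mean makes sense.
mean_temp = statistics.mean(temps_f)

print(most_common_major, median_agreement, mean_temp)
```

Taking a mean of the ordinal ranks would run without error, which is exactly why it helps to decide the level of measurement before picking a statistic.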
Helpful online glossaries of terms:
- A glossary of social science data terms is available here.
- NSF glossary of statistical terminology
- Internet Glossary of Statistical Terms
- Glossary of Terms for Metadata, Taxonomies, and Digital Libraries
- More information about data formats and file types from ICPSR
Guidance on choosing the right analysis method is also available:
- StatPages provides some decision trees and other helpful information
- An explanation of parametric vs non-parametric data
For help with qualitative analysis, you can check out this website:
Below is a beginning list of file extensions & what they're usually associated with.
| File Extension | Type of File | Associated Software | Important Details |
| --- | --- | --- | --- |
| .sav | data file | SPSS | |
| .spo | data output file | SPSS | |
| .xpt | portable data file | SAS | |
| .spv | output file | SPSS | |
| .por | portable data file | SPSS | Good for interoperability |
| .dta | data file | Stata | |
| .txt | ASCII text file | anything, but... | .txt files can have the data arranged many ways (comma or tab delimited, flat, rectangular, hierarchical, no delimiters ...). They also may not have metadata associated with them. The metadata will need to come from someplace else (typically a "data dictionary" or "codebook"). |
| .sps | syntax/program/code file | SPSS | Creating syntax files makes it easy to reproduce your work in a matter of moments (without a lot of clicking around). You usually need a syntax file to access “raw” (just a file full of numbers, not in any format) data. |
| .sas | syntax/program/code file | SAS | We don't have any publicly available SAS licenses. If this code is all you can get your hands on with bare text data, get it! Paula will help you translate it into something you can use in other stats packages. |
| .dct or .do | syntax/program/code file | Stata | |
| .csv | comma separated values | most packages | R reads this format directly, and it's easily opened in most data packages. It is likely to have column headers (variable names) but no labels or other useful metadata. |
| .tsv | tab separated values | most packages | This is easily opened in most data packages. It is likely to have column headers (variable names) but no labels or other useful metadata. This is a good format to use when you export your file if *any* of your fields could have commas in them. (Surveys, for instance.) |
| .xls | Excel® versions before 2007 | Excel and many/most others | This format is easily accessible to most stats packages. Like .tsv or .csv, there are likely to be column headers but no labels or other useful metadata. Also, fields with special formatting (e.g., date, time, and currency) are translated inconsistently into other programs. |
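
To illustrate the advice about commas in fields, here is a small sketch using Python's standard `csv` module on invented survey data (the column names and responses are hypothetical). It writes the same rows as comma-separated and tab-separated text, shows that a naive split on commas miscounts the fields, and uses `csv.Sniffer` to guess the delimiter of a text file whose layout is unknown.

```python
import csv
import io

# Hypothetical survey rows; the free-text comment contains a comma.
rows = [
    ["id", "comment"],
    ["1", "Great course, would recommend"],
    ["2", "Too fast"],
]

def to_text(rows, delimiter):
    """Serialize rows with the given delimiter using the csv module."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buf.getvalue()

csv_text = to_text(rows, ",")
tsv_text = to_text(rows, "\t")

# A naive split on commas finds three pieces in the first data row, not two,
# because the comma inside the comment looks like a separator.
naive_fields = csv_text.splitlines()[1].split(",")

# Splitting the tab-separated version on tabs recovers both fields intact.
tsv_fields = tsv_text.splitlines()[1].split("\t")

# A real CSV reader respects the quoting and also recovers both fields.
parsed_csv = list(csv.reader(io.StringIO(csv_text)))

# For a .txt file of unknown layout, csv.Sniffer can guess the delimiter.
# It is a heuristic and can guess wrong, so check the result against the codebook.
guessed = csv.Sniffer().sniff(tsv_text, delimiters=",\t").delimiter

print(len(naive_fields), tsv_fields, repr(guessed))
```

The takeaway is that comma-separated files are safe only when every program in the chain honors the quoting rules; tab-separated export sidesteps the problem as long as the fields themselves contain no tabs.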







