StatCat Search Help
Words in a search are automatically connected with AND. That is, a search on income education will find records with both the word income AND the word education.
Use Boolean operators (AND, OR, NOT) to combine search terms. For example: income OR education finds records with either word in the record. income NOT education finds records with the word income but not the word education. AND, OR, NOT must be in capital letters.
Use quotation marks to search phrases. For example: "current population survey"
Search terms are not case-sensitive.
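As a rough illustration of how these operators combine results, the sketch below treats each word as the set of record IDs that contain it; AND, OR, and NOT then correspond to set intersection, union, and difference. The sample records and the helper names (`build_index`, `ids`) are made up for illustration and are not StatCat's actual implementation.

```python
from collections import defaultdict

# Hypothetical sample records (ID -> text); not real StatCat entries.
RECORDS = {
    1: "Income and education in urban households",
    2: "Household income dynamics",
    3: "Education outcomes by state",
}

def build_index(records):
    """Map each lowercased word to the set of record IDs containing it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index

INDEX = build_index(RECORDS)

def ids(word):
    """Record IDs containing the word; the lookup ignores case."""
    return INDEX.get(word.lower(), set())

both = ids("income") & ids("education")         # income education (implicit AND)
either = ids("income") | ids("education")       # income OR education
only_income = ids("income") - ids("education")  # income NOT education

print(sorted(both), sorted(either), sorted(only_income))
```

Because every word is lowercased both when indexing and when looking it up, INCOME and income return the same records, which mirrors the case-insensitivity described above.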
Truncation and wildcards:
Use ? for a single character wildcard search.
For example: wom?n finds women or woman.
Use * for a multiple character wildcard search or to truncate a word.
For example: test* finds test, tests, testing, etc.
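One way to picture the two wildcards is as a translation into a regular expression, where ? stands for exactly one character and * for zero or more characters. The sketch below is only an approximation of that behavior; the function names `wildcard_to_regex` and `matches` are hypothetical, and the search engine itself may handle wildcards differently.

```python
import re

def wildcard_to_regex(term):
    """Translate a wildcard term into a compiled, case-insensitive regex."""
    parts = []
    for ch in term:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(term, word):
    """True if the whole word matches the wildcard term."""
    return wildcard_to_regex(term).fullmatch(word) is not None

print(matches("wom?n", "women"), matches("wom?n", "woman"))
print(matches("test*", "test"), matches("test*", "testing"))
```

Note that the match is anchored to the whole word, so test* does not match contest.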
Use + to indicate that a term must appear. For example, income +education will locate records containing the word "education"; if the record also contains the word "income", it will be ranked higher in the results.
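The effect of + can be sketched as a filter followed by a sort: records missing a required term are dropped, and the remaining records are ordered by how many of the other terms they contain. This is a simplified model under stated assumptions; the `search` function and sample records are hypothetical, it only covers queries that use +, and the real relevance ranking is more elaborate than a simple count.

```python
# Hypothetical sample records (ID -> text); not real StatCat entries.
RECORDS = {
    1: "Income and education in urban households",
    2: "Household income dynamics",
    3: "Education outcomes by state",
}

def search(query, records):
    """Return IDs of matching records, best match first.

    Terms prefixed with '+' are required; other terms only raise the rank.
    """
    terms = query.lower().split()
    required = [t[1:] for t in terms if t.startswith("+")]
    optional = [t for t in terms if not t.startswith("+")]
    scored = []
    for rec_id, text in records.items():
        words = set(text.lower().split())
        if not all(t in words for t in required):
            continue  # a required term is missing, so skip this record
        score = sum(t in words for t in optional)
        scored.append((score, rec_id))
    # Highest score first; ties are broken by record ID.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for _, rec_id in scored]

print(search("income +education", RECORDS))
```

Here record 2 is excluded because it lacks "education", and record 1 is listed ahead of record 3 because it also contains "income".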
See the Jakarta Lucene Query Parser Syntax page for information on constructing additional types of searches.
"Keywords anywhere" searches the following fields (for field definitions, see below):
Abstract
Author
Bibliographic Citation
Data Collector
Distributor
Geographic Coverage
Geographic Unit
Holding Notes
Keyword
Producer
Related Materials
Related Publications
Related Studies
Series Name
Series Information
Title
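To make the scope of a "Keywords anywhere" search concrete, the sketch below checks a term against only the fields listed above and reports which of them contain it. The record is invented for illustration, the function name `keywords_anywhere` is hypothetical, and a plain substring test stands in for the real search engine's matching.

```python
# Fields covered by "Keywords anywhere", as listed above.
KEYWORDS_ANYWHERE_FIELDS = [
    "Abstract", "Author", "Bibliographic Citation", "Data Collector",
    "Distributor", "Geographic Coverage", "Geographic Unit", "Holding Notes",
    "Keyword", "Producer", "Related Materials", "Related Publications",
    "Related Studies", "Series Name", "Series Information", "Title",
]

def keywords_anywhere(term, record):
    """Return the searched fields of the record in which the term occurs."""
    needle = term.lower()
    return [
        field for field in KEYWORDS_ANYWHERE_FIELDS
        if needle in record.get(field, "").lower()
    ]

# Hypothetical record; "Funding Agency" is not among the searched fields.
record = {
    "Title": "Current Population Survey",
    "Abstract": "Monthly survey of households",
    "Funding Agency": "Example sponsor",
}

print(keywords_anywhere("survey", record))
print(keywords_anywhere("sponsor", record))
```

A term that appears only in a field outside the list, such as Funding Agency in this example, would not be found by a "Keywords anywhere" search.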
Subject headings are no longer used in StatCat and the "Browse Subjects" function is no longer available.
Abstract
A summary description of the data collection.
Author
The name of the study's principal investigator(s). Authors can be individuals,
organizations, or a combination of both.
Bibliographic Citation
Use the bibliographic citation in papers or publications based on the data.
For more information on citing ICPSR versions of studies, see Citing Electronic Data Files.
Case Count
The number of cases or observations in the data file. For hierarchical data
files and non-data files, this field is filled in as "inap."
Class or Status of the Study
Indicates the processing status of the study. Data distributors may express
this as a class or study status number.
Collection Notes
Used to describe details about the data collection that are not recorded in
other fields.
Data Access Information
Links to details of data holdings available on CD-ROM or on the Statlab
server at Yale or available on the Internet.
Data Appraisal
Describes issues such as response variance, nonresponse rate and testing for
bias, interviewer and response bias, confidence levels, question bias, etc.
Data Collector
The individual, agency, or institution responsible for administering the questionnaire
or interview or compiling the data.
Data Format
Physical format of the data file, such as: logical record length format (LRECL),
card image (i.e. data with multiple records per case), OSIRIS, SPSS Portable,
SAS Transport, delimited format, etc. For more information, see ICPSR's
Types of Data Formats and Types of Data Structures.
Data Source
Describes the type of technique or data collection instrument used to collect
the data. Also used to list any book(s), article(s), serial(s), and/or machine-readable
data file(s) that served as the source(s) of the data file.
Date of Collection
Date(s) when the data were collected (as distinguished from Date of Production
or Time Period).
Date of Production
Date the data collection was produced (as distinguished from Date of Collection
or Time Period).
Distributor
The organization designated by the author or producer to generate copies
of a particular data collection including any necessary editions or revisions.
Examples: ICPSR, Roper Center.
Extent of Collection
Summarizes the number of physical files that exist in a collection, recording
the number of files that contain data and noting whether the collection contains
machine-readable documentation and/or other supplementary files and information.
Extent of Processing Checks
This field contains abbreviations that describe processing activities and checks
performed on data collections either by ICPSR or by others.
ICPSR's Extent of Processing Key:
CDBK.ICPSR = ICPSR produced a codebook for this collection.
CONCHK.PR = Consistency checks performed by Data Producer/Principal Investigator.
CONCHK.ICPSR = Consistency checks performed by ICPSR.
DDEF.ICPSR = ICPSR generated SAS and/or SPSS data definition statements for this collection.
FREQ.PR = Frequencies provided by Data Producer/Principal Investigator.
FREQ.ICPSR = Frequencies provided by ICPSR.
MDATA.PR = Missing data codes standardized by Data Producer/Principal Investigator.
MDATA.ICPSR = Missing data codes standardized by ICPSR.
RECODE = ICPSR performed recodes and/or calculated derived variables.
REFORM.DATA = Data reformatted by ICPSR.
REFORM.DOC = Documentation reformatted by ICPSR.
SCAN = Hardcopy documentation converted to machine-readable form by ICPSR.
UNDOCCHK.PR = Checks for undocumented codes performed by Data Producer/Principal Investigator.
UNDOCCHK.ICPSR = Checks for undocumented codes performed by ICPSR.
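When reading many records, the key above can be kept as a small lookup table. The sketch below assumes, purely for illustration, that the field holds the abbreviations separated by spaces or commas; the actual layout of the field in a record is not specified here, and the function name `describe_processing` is hypothetical.

```python
# Abbreviations and meanings transcribed from ICPSR's key above.
PROCESSING_KEY = {
    "CDBK.ICPSR": "ICPSR produced a codebook for this collection",
    "CONCHK.PR": "Consistency checks performed by Data Producer/Principal Investigator",
    "CONCHK.ICPSR": "Consistency checks performed by ICPSR",
    "DDEF.ICPSR": "ICPSR generated SAS and/or SPSS data definition statements",
    "FREQ.PR": "Frequencies provided by Data Producer/Principal Investigator",
    "FREQ.ICPSR": "Frequencies provided by ICPSR",
    "MDATA.PR": "Missing data codes standardized by Data Producer/Principal Investigator",
    "MDATA.ICPSR": "Missing data codes standardized by ICPSR",
    "RECODE": "ICPSR performed recodes and/or calculated derived variables",
    "REFORM.DATA": "Data reformatted by ICPSR",
    "REFORM.DOC": "Documentation reformatted by ICPSR",
    "SCAN": "Hardcopy documentation converted to machine-readable form by ICPSR",
    "UNDOCCHK.PR": "Checks for undocumented codes performed by Data Producer/Principal Investigator",
    "UNDOCCHK.ICPSR": "Checks for undocumented codes performed by ICPSR",
}

def describe_processing(field_value):
    """Expand each abbreviation in the field into (code, meaning) pairs.

    Assumes codes are separated by whitespace and/or commas (an assumption).
    """
    codes = field_value.replace(",", " ").split()
    return [(code, PROCESSING_KEY.get(code, "unknown code")) for code in codes]

for code, meaning in describe_processing("CDBK.ICPSR, FREQ.PR SCAN"):
    print(f"{code}: {meaning}")
```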
File Structure
Used to describe the structure of the file/part: rectangular, hierarchical,
or relational. Note: "inap." will appear in this field for codebook files, dictionary
files, and other non-data files. For more information, see ICPSR's
Types of Data Structures.
File Type
Types of data files include raw data (ASCII, EBCDIC, etc.) and software-dependent
files (SAS, SPSS, etc.).
Frequency of Data Collection
Used if data were collected at more than one point in time (e.g. monthly, quarterly).
Funding Agency
The source(s) of funds for production of the data collection.
Geographic Coverage
Geographic scope of the data; may include additional levels of geographic coding
provided in the variables.
Geographic Unit
Lowest level of geographic aggregation covered by the data (e.g. state).
Grant Number
The grant or contract number of the project that sponsored the data collection.
Holding Notes
Details distinguishing a particular holding of a dataset.
Keyword
Words or phrases that describe a data collection's content. In StatCat,
"keywords anywhere" searches the keyword field as well as several
others (see above).
Kind of Data
Examples of different kinds of data include:
Logical Record Length
Number of characters in the logical record of each file or part.
Media
On what medium or media are the data available; e.g. CD-ROM, Statlab server, Internet.
Mode of Data Collection
Method used to collect the data (e.g. telephone interviews, mail questionnaires,
etc.).
Nation
Country or countries covered in the file.
Place of Production
Address of the archive or agency that produced the data collection (see
Producer).
Producer
The producer of the data collection is the person or organization with the financial
or administrative responsibility for the physical processes whereby the data
collection was brought into existence.
Records Per Case
Used for card-image data or other files in which there are multiple records
per case. Note: "inap." will appear in this field for codebook files, dictionary
files, and other non-data files as well as hierarchical data files.
Related Materials
Describes materials related to the study description, such as appendices,
additional information on sampling found in other documents, etc.
Related Publications
Contains information about primary or related publications that are based on
the data, such as articles and reports.
Related Studies
Information on the relationship of the current data collection to others
(e.g., predecessors, successors, other waves or rounds) or to other editions
of the same file. This would include the names of additional data collections
generated from the same data collection vehicle plus other collections directed
at the same general topic.
Response Rates
The proportion of respondents from the selected sample who provided information.
Restrictions
Contains information regarding any limitations on use or restrictions on access
to the file(s). Example: "Data may be used by current Yale faculty, students,
and staff."
Sampling
Describes how the cases that appear in the study were selected. The sample is
a selection out of the universe of all possible relevant cases (e.g. adults
in the United States, housing units in three counties of Michigan, etc.) that
could have been included in the study.
Series Name and Series Description
The name of and information about the data series to which the collection belongs,
if any.
Study Number
Unique number assigned by the distributor. ICPSR numbers are four digits; Roper
numbers are a combination of letters and numbers. Other distributors may not
assign numbers to studies.
Time Method
Types of time methods include: panel survey, cross-section, trend study, time
series, etc.
Time Period
The time period covered by the data (as distinguished from Date of Collection
or Date of Production).
Total Number of Records
Overall record count in the file. Used in instances such as files with multiple
cards/decks or records per case.
Unit Of Observation
Describes who or what are being studied: individuals, families/households, groups,
institutions, etc.
Universe
The group of persons or other elements that are the object of the study and
to which the study results refer. Age, nationality, and residence commonly help
to delineate a given universe, but any of a number of factors may be involved,
such as sex, race, income, veteran status, criminal convictions, etc. The universe
may consist of elements other than persons, such as housing units, court cases,
deaths, countries, etc.
Variable Count
Number of variables in the file.
Version History
This field is used to explain changes that have been made to the data collection
since its last release.

Some of these definitions are adapted or copied from the Data Documentation
Initiative (DDI) and the Inter-university Consortium for Political and Social
Research (ICPSR) field descriptions.
Yale University Social Science Statistical Laboratory and Social Science Libraries and Information Services
Reference services: Social Science Data Librarian
Statistical consulting: Statlab
This page last modified: June 14, 2005
© 2005 Yale University
