MAIN POINTS

Introduction

The data collection methods discussed so far in the text generate primary data. Increasingly, social scientists make use of data that were previously collected by other investigators, usually for purposes different from the original research objectives; secondary data analysis refers to research findings based on data collected by others.

Why Secondary Data Analysis?

Secondary analysis has a rich intellectual tradition in the social sciences. There are three major reasons for the increased use of secondary data: 1) conceptual-substantive (secondary data may be the only source of data available to study certain research problems), 2) methodological (there are at least five methodological advantages to secondary analysis: it provides opportunities for replication; the availability of data over time makes longitudinal research designs possible; it may improve measurement; sample size, representativeness, and the number of observations may be increased; and it can be used for triangulation purposes, thus perhaps increasing the credibility of research findings obtained with primary data), and 3) economic.

Like other data collection methods, secondary analysis has certain limitations. Perhaps the most serious problem is that the available data often only approximate the kind of data that the investigator would like to have for testing hypotheses. A second problem is that access to such data can be difficult. Third, there may be insufficient information about the collection of the data to determine potential sources of bias, errors, or problems with internal or external validity.

Searching and Sourcing Secondary Data

Guidelines for data search include specification of needs, initial familiarization, initial contacts, secondary contacts, accessibility, and analysis and supplemental analyses. The major resources available to secondary analysts searching for data are catalogs, guides, and directories of archives and organizations established to assist researchers.

An unobtrusive measure is any method of data collection that directly removes the researcher from the set of interactions, events, or behaviors being investigated. These measures range from private and public archives to simple observations of the behavior of people at work or play.

Simple observation is a basic variety of unobtrusive measure. There are four types: physical signs, expressions, locations, and behaviors (including language). Although there are many benefits to using simple observation, there are also limitations. First, the recorded observations may not represent a wide enough population, thereby limiting the scope of generalizations based on them. Second, the data do not necessarily lend themselves to straightforward and clear-cut explanations.

Archival records constitute a rich source of information that may be studied without direct contact with the entities being observed. There are two major sources of archival information: public records and private records. Four basic kinds of public records can be distinguished: actuarial records, political and judicial records, governmental documents, and the mass media. Unlike public records, private records are difficult to obtain; they include autobiographies, diaries, and letters. A major problem with the use of private documents lies in the need to ensure their authenticity.

The Internet has become an important source of reference and information for social scientists. It allows access to a host of rich secondary data sources, such as the U.S. Census. A census is defined as the recording of demographic data of a population in a strictly defined territory, made by the government at a specific time and at regular intervals.

Content Analysis

Content analysis is both a means of gathering data and a method of analysis. Instead of observing people's behavior directly or asking them about it, the researcher takes the communications that people have produced and "asks questions" of the communications. Content analysis involves the systematic examination of the content of communications in order to make inferences about the characteristics of the text, antecedents of the message, or effects of the communication.

The content analysis procedure involves the interaction of two processes: specification of the content characteristics to be measured, and application of the rules for identifying and recording the characteristics when they appear in the texts to be analyzed. Five major recording units have been used frequently in content analysis research: words or terms, themes, characters, paragraphs, and items. Content analysis typically involves one of four systems of enumeration: 1) time or space measures examine the amount of time or space devoted to an issue; 2) appearance measures assess whether or not a word or theme appears in a commentary; 3) frequency counts assess how often a theme, word, or character appears in a communication; and 4) intensity measures focus on the degree to which a theme or idea is expressed.
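
The four systems of enumeration can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a standard content analysis procedure: the recording unit is the single word, the space measure is approximated as the share of words falling in sentences that mention the term, and intensity is the sum of coder-assigned weights. The function names (words, enumerate_term), the weights, and the sample text are all hypothetical.

```python
import re
from collections import Counter


def words(text):
    """Lowercase the text and return its word tokens (the recording unit here)."""
    return re.findall(r"[a-z']+", text.lower())


def enumerate_term(text, term, intensity_weights=None):
    """Apply four enumeration systems to one term in one text.

    intensity_weights is a hypothetical coding scheme that maps words to
    numeric strengths assigned by the researcher; it is an assumption of
    this sketch, not a fixed rule.
    """
    term = term.lower()
    tokens = words(text)

    # Frequency count: how often the term appears.
    frequency = Counter(tokens)[term]

    # Appearance measure: whether the term appears at all.
    appearance = frequency > 0

    # Space measure (rough proxy): share of all words that fall in
    # sentences mentioning the term.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words_on_term = sum(len(words(s)) for s in sentences if term in words(s))
    space_share = words_on_term / len(tokens) if tokens else 0.0

    # Intensity measure: total of the coder-assigned weights found in the text.
    weights = intensity_weights or {}
    intensity = sum(weights.get(tok, 0) for tok in tokens)

    return {
        "appearance": appearance,
        "frequency": frequency,
        "space_share": space_share,
        "intensity": intensity,
    }


if __name__ == "__main__":
    sample = "Taxes are too high. Taxes hurt families badly. The weather was mild."
    print(enumerate_term(sample, "taxes", {"too": 1, "badly": 2}))
```

In practice the recording unit, the definition of space, and the intensity weights all come from the researcher's coding rules, so each of these choices would be replaced by the scheme specified for the study.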