Revision as of 07:35, 23 April 2012 by Rhea (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)


Overview of CRSP/Compustat Merged Database in WRDS

Retrived from course forum for ECE695: "Financial Engineering", Spring 2012


The CRSP/Compustat Merged Database (CCM) is comprised of CRSP and Compustat data together with the link and link-history references between these two databases. It includes Standard & Poor's Compustat data, reformatted into CRSP's propreitary CRSPAccess database format, plus additional data tables that map the CRSP permanent company and security identifiers (PERMCO and PERMNO, respectively) to the COMPUSTAT permanent company identifier (GVKEY).

CRSP Compustat
Covers stock market data on major stock exchanges (NYSE, AMEX, NASDAQ) Covers basically accounting data for public, OTC and private companies
Main Identifiers: PERMNO and PERMCO Main Identifier: GVKEY
Secondary Identifiers: Header and Historical 8-digit CUSIPs Secondary Identifers: Header 6-digit CUSIP (named CNUM)
Other identifiers: Ticker Symbol and Company Names, Header and Historical Other Identifiers: Header Ticker Symbol and Company Names
A firm can have mutiple securities
Special Compustat records: Red-Herring Companies
Data after IPO In general, Compustat requires that a firm has some number of years of history as public company before including it in the dataset.
Data since 1925 Data since 1950

More in-depth article is available at here

CRSP-CompuSTAT Merged database related data sets

CCM_LOOKUP For web query CODE_LOOKUP use only
CCM_QVARDS Dummy data set for web query QVARDS use only
CCMXPF_LINKTABLE CRSP/COMPUSTAT Merged - Link History w/ Used Flag
CCMXPF_LNKHIST Native Link usage provides access to Compustat records, regardless of whether or not securities are in the CRSP universe.
CCMXPF_LNKRNG Information on the fiscal periods associated with each used link for each time series calendar frequency and keyset
CCMXPF_LNKUSED LINKUSED structure builds a composite Compustat record from one or more Compustat GVKEYs and IIDs linked to a CRSP PERMNO
COMPHEAD CRSP/COMPUSTAT Merged - Company Header Information
COMPHIST CRSP/COMPUSTAT Merged - Company Header History
COMPMASTER Master Defined Structure
SECHEAD CRSP/COMPUSTAT Merged - Security Header
SECHIST CRSP/COMPUSTAT Merged - Security Header History
To see which columns exist in each dataset, you need to go to Datasets and click table name such as CCMXPF_LNKHIST. Then it will show
Position Variable Name Type Length Description
1 gvkey Char 6 Standard and Poor\\\'\\\'s Identifier
2 linkprim Char 1 Primary Link Marker
3 liid Char 3 Security-level Identifer
4 linktype Char 2 Link Type Code
5 lpermno Num 8 Historical CRSP PERMNO Link to COMPUSTAT Record
6 lpermco Num 8 Historical CRSP PERMCO Link to COMPUSTAT Record
7 USEDFLAG Num 8 Flag marking whether link is used in building composite record
8 linkdt Num 8 First Effective Date of Link
9 linkenddt Num 8 Last Effective Date of Link

Lost of sample SAS programs are given such as

SAS Files Description
CCM_XPF/ccm_lnktable This program shows the main processing part of the CCM web queries. XpressFeed version CCM product provides more detailed information on linking history, in the meanwhile it also populates much more observations than the old version, among which there are some consecutive records (consecutive linking date range) with the same GVKEYIIDPERMNO pair but different LINKTYPE or USEDFLAG or LINKPRIM value. This program will collapse such consecutive rows and combine them into one.
CCM_XPF/ccmfunda *Use the CRSP Compustat Merged Database (CCM) to extract datA from the Compustat Fundamentals Annual table (FUNDA) using permanent identifiers (GVKEY, NPERMNO or NPERMCO).
CCM_XPF/ccmfunda2 *Use the CRSP Compustat Merged Database (CCM) to extract data from the Compustat Fundamentals Annual table (FUNDA) using permanent identifiers (GVKEY, NPERMNO or NPERMCO0.
CCM_XPF/ccmfundacrsp Description Not Available
CCM_XPF/ccmfundq *Use the CRSP Compustat Merged Database (CCM) to extract data from the Compustat Fundamentals Quarterly table (FUNDQ) using permanent identifiers (GVKEY, NPERMNO or NPERMCO).
CCM_XPF/ccmkeys *Find the GVKEYs associated with nonpermanent identifiers (e.g., SMBL or CNUM) using the CRSPCompustat Merged Database (CCM) header tables.
CCM_XPF/ccmsecm Extract data from CRSP/COMPUSTAT Merged Database - Security Monthly (secm).
CCM_XPF/cvccmlnk *This macro creates the CCMXPF_LINKDTABLE data set. It contains data from the Link History table as well as the USEDFLAG variable from the Link Used table.
CCM_XPF/merge_funda_crsp_byccm This Program merges CRSP and Compusat Xpressfeed databases using the CCM product. To be able to run the program, a user should have access to CRSPCompustat Merged database, Compustat Annual Xpressfeed datasets and CRSP monthly database.


Sample code to print out returns on earnings report dates in 2010

/* 
 
 This SAS code extracts three company's daily returns on earnings reports dates in 2010.
 Three companies are Microsoft(MSFT), DELL, and IBM.
 
	PERMNO TICKER DATE(first traded date)
	11081 DELL 19880622
	12490 IBM 19620702
	10107 MSFT 19860313
 
 
 Since the primary key of the COMPUSTAT is the gvkey but not PERMNO, we first need to look up which company in CRSP is mapped to
 another company in COMPUSTAT. CRSP/COMPUSTAT MERGED(CCM) database provides us with the mapping table between PERMNO and GVKEY.
 
 Earning report dates are in COMPUSTAT only.
 Therefore this code will find all earning reports date and gvkey.
 Later stocks belong to this gvkeys will be searched in the mapping table so as to get daily returns on the earnings report date.
 Here earning report date is also called RDQ in COMPUSTAT database.
*/
 
libname own  '/home/purdue/YOURNAMEHERE';
libname A_CCM '/wrds/crsp/sasdata/a_ccm';
 
%let begdate = '01JAN2010'd;
%let enddate = '31DEC2010'd;
 
/* We are not interested in earnings reports data not lying between begin date and end date 
  so filter out them.
 */
DATA ERD;
SET compa.fundq;
keep gvkey rdq;
WHERE  rdq<= &enddate AND rdq>=&begdate;
run;
 
DATA LT;
SET a_ccm.ccmxpf_linktable;
keep gvkey lpermno linkdt linkenddt;
run;
 
/* Keep three company's daily returns for 2010 */
proc SQL;
CREATE VIEW DR AS
SELECT d.permno, d.DATE, d.ret 
FROM crspa.dsf d 
WHERE d.permno IN ( 11081, 12490, 10107) AND
   &begdate <= d.date<= &enddate;
 
 /* The linking between PERMNO and GVKEY has expiration date.
   If the mapping is still valid at the time point of DR.date, use the mapping 
   */
 CREATE TABLE retEPdates
 AS SELECT DISTINCT LT.gvkey, LT.lpermno, DR.DATE, DR.ret, LT.linkdt, LT.linkenddt
 FROM DR, LT 
 WHERE DR.permno=LT.lpermno AND DR.DATE >= LT.linkdt AND 
		((LT.linkenddt =.E ) OR (DR.DATE <=LT.linkenddt))
 ORDER BY LT.lpermno, DR.DATE;
 
/* Filter in daily returns on earning reports dates */
 CREATE TABLE MATCH
 AS SELECT DISTINCT t.gvkey, t.lpermno, t.DATE, t.ret, cd.rdq
 FROM retEPdates t, ERD cd
 WHERE cd.gvkey=t.gvkey AND  t.DATE=cd.rdq
 ORDER BY t.lpermno, t.DATE; 
quit;
 
/* temporary dataset will disappear unless you store it.
   One way to save dataset is to store it under your home directory 
   by specifying libname "own"
 */
DATA own.EPdata;
SET MATCH; 
format DATE date9.;
run;

How to execute this code

wrds(~)% wrds(~)% nohup sas FindEarningRepDates.sas & [1] 28916 wrds(~)% When it is done, it will display completion wrds(~)% [1] Exit 9 sas FindEarningRepDates.sas

Sample output

EPdata.sas7bdat is the final output file. You can open this file from SAS/SAS viewer or you can export to .txt, .csv, or excel files as you wish

gvkey lpermno date ret rdq
012141 10107 28JAN2010 -0.017189 20100128
012141 10107 22APR2010 0.001851 20100422
012141 10107 22JUL2010 0.028662 20100722
012141 10107 28OCT2010 0.008829 20101028
014489 11081 18FEB2010 0.022608 20100218
014489 11081 20MAY2010 -0.044059 20100520
014489 11081 19AUG2010 -0.012305 20100819
014489 11081 18NOV2010 0.023595 20101118
006066 12490 19JAN2010 0.017909 20100119
006066 12490 19APR2010 0.012248 20100419
006066 12490 19JUL2010 0.013747 20100719
006066 12490 18OCT2010 0.012548 20101018

  • Write question/comment here.
    • answer here.

Back to ECE695: "Financial Engineering", Spring 2012

Alumni Liaison

ECE462 Survivor

Seraj Dosenbach