Contents
Overview of CRSP/Compustat Merged Database in WRDS
Retrived from course forum for ECE695: "Financial Engineering", Spring 2012
The CRSP/Compustat Merged Database (CCM) is comprised of CRSP and Compustat data together with the link and link-history references between these two databases. It includes Standard & Poor's Compustat data, reformatted into CRSP's propreitary CRSPAccess database format, plus additional data tables that map the CRSP permanent company and security identifiers (PERMCO and PERMNO, respectively) to the COMPUSTAT permanent company identifier (GVKEY).
CRSP | Compustat |
Covers stock market data on major stock exchanges (NYSE, AMEX, NASDAQ) | Covers basically accounting data for public, OTC and private companies |
Main Identifiers: PERMNO and PERMCO | Main Identifier: GVKEY |
Secondary Identifiers: Header and Historical 8-digit CUSIPs | Secondary Identifers: Header 6-digit CUSIP (named CNUM) |
Other identifiers: Ticker Symbol and Company Names, Header and Historical | Other Identifiers: Header Ticker Symbol and Company Names |
A firm can have mutiple securities | |
Special Compustat records: Red-Herring Companies | |
Data after IPO | In general, Compustat requires that a firm has some number of years of history as public company before including it in the dataset. |
Data since 1925 | Data since 1950 |
More in-depth article is available at here
CCM_LOOKUP | For web query CODE_LOOKUP use only |
CCM_QVARDS | Dummy data set for web query QVARDS use only |
CCMXPF_LINKTABLE | CRSP/COMPUSTAT Merged - Link History w/ Used Flag |
CCMXPF_LNKHIST | Native Link usage provides access to Compustat records, regardless of whether or not securities are in the CRSP universe. |
CCMXPF_LNKRNG | Information on the fiscal periods associated with each used link for each time series calendar frequency and keyset |
CCMXPF_LNKUSED | LINKUSED structure builds a composite Compustat record from one or more Compustat GVKEYs and IIDs linked to a CRSP PERMNO |
COMPHEAD | CRSP/COMPUSTAT Merged - Company Header Information |
COMPHIST | CRSP/COMPUSTAT Merged - Company Header History |
COMPMASTER | Master Defined Structure |
SECHEAD | CRSP/COMPUSTAT Merged - Security Header |
SECHIST | CRSP/COMPUSTAT Merged - Security Header History |
Position | Variable Name | Type | Length | Description |
1 | gvkey | Char | 6 | Standard and Poor\\\'\\\'s Identifier |
2 | linkprim | Char | 1 | Primary Link Marker |
3 | liid | Char | 3 | Security-level Identifer |
4 | linktype | Char | 2 | Link Type Code |
5 | lpermno | Num | 8 | Historical CRSP PERMNO Link to COMPUSTAT Record |
6 | lpermco | Num | 8 | Historical CRSP PERMCO Link to COMPUSTAT Record |
7 | USEDFLAG | Num | 8 | Flag marking whether link is used in building composite record |
8 | linkdt | Num | 8 | First Effective Date of Link |
9 | linkenddt | Num | 8 | Last Effective Date of Link |
Lost of sample SAS programs are given such as
SAS Files | Description |
CCM_XPF/ccm_lnktable | This program shows the main processing part of the CCM web queries. XpressFeed version CCM product provides more detailed information on linking history, in the meanwhile it also populates much more observations than the old version, among which there are some consecutive records (consecutive linking date range) with the same GVKEYIIDPERMNO pair but different LINKTYPE or USEDFLAG or LINKPRIM value. This program will collapse such consecutive rows and combine them into one. |
CCM_XPF/ccmfunda | *Use the CRSP Compustat Merged Database (CCM) to extract datA from the Compustat Fundamentals Annual table (FUNDA) using permanent identifiers (GVKEY, NPERMNO or NPERMCO). |
CCM_XPF/ccmfunda2 | *Use the CRSP Compustat Merged Database (CCM) to extract data from the Compustat Fundamentals Annual table (FUNDA) using permanent identifiers (GVKEY, NPERMNO or NPERMCO0. |
CCM_XPF/ccmfundacrsp | Description Not Available |
CCM_XPF/ccmfundq | *Use the CRSP Compustat Merged Database (CCM) to extract data from the Compustat Fundamentals Quarterly table (FUNDQ) using permanent identifiers (GVKEY, NPERMNO or NPERMCO). |
CCM_XPF/ccmkeys | *Find the GVKEYs associated with nonpermanent identifiers (e.g., SMBL or CNUM) using the CRSPCompustat Merged Database (CCM) header tables. |
CCM_XPF/ccmsecm | Extract data from CRSP/COMPUSTAT Merged Database - Security Monthly (secm). |
CCM_XPF/cvccmlnk | *This macro creates the CCMXPF_LINKDTABLE data set. It contains data from the Link History table as well as the USEDFLAG variable from the Link Used table. |
CCM_XPF/merge_funda_crsp_byccm | This Program merges CRSP and Compusat Xpressfeed databases using the CCM product. To be able to run the program, a user should have access to CRSPCompustat Merged database, Compustat Annual Xpressfeed datasets and CRSP monthly database. |
Sample code to print out returns on earnings report dates in 2010
/* This SAS code extracts three company's daily returns on earnings reports dates in 2010. Three companies are Microsoft(MSFT), DELL, and IBM. PERMNO TICKER DATE(first traded date) 11081 DELL 19880622 12490 IBM 19620702 10107 MSFT 19860313 Since the primary key of the COMPUSTAT is the gvkey but not PERMNO, we first need to look up which company in CRSP is mapped to another company in COMPUSTAT. CRSP/COMPUSTAT MERGED(CCM) database provides us with the mapping table between PERMNO and GVKEY. Earning report dates are in COMPUSTAT only. Therefore this code will find all earning reports date and gvkey. Later stocks belong to this gvkeys will be searched in the mapping table so as to get daily returns on the earnings report date. Here earning report date is also called RDQ in COMPUSTAT database. */ libname own '/home/purdue/YOURNAMEHERE'; libname A_CCM '/wrds/crsp/sasdata/a_ccm'; %let begdate = '01JAN2010'd; %let enddate = '31DEC2010'd; /* We are not interested in earnings reports data not lying between begin date and end date so filter out them. */ DATA ERD; SET compa.fundq; keep gvkey rdq; WHERE rdq<= &enddate AND rdq>=&begdate; run; DATA LT; SET a_ccm.ccmxpf_linktable; keep gvkey lpermno linkdt linkenddt; run; /* Keep three company's daily returns for 2010 */ proc SQL; CREATE VIEW DR AS SELECT d.permno, d.DATE, d.ret FROM crspa.dsf d WHERE d.permno IN ( 11081, 12490, 10107) AND &begdate <= d.date<= &enddate; /* The linking between PERMNO and GVKEY has expiration date. If the mapping is still valid at the time point of DR.date, use the mapping */ CREATE TABLE retEPdates AS SELECT DISTINCT LT.gvkey, LT.lpermno, DR.DATE, DR.ret, LT.linkdt, LT.linkenddt FROM DR, LT WHERE DR.permno=LT.lpermno AND DR.DATE >= LT.linkdt AND ((LT.linkenddt =.E ) OR (DR.DATE <=LT.linkenddt)) ORDER BY LT.lpermno, DR.DATE; /* Filter in daily returns on earning reports dates */ CREATE TABLE MATCH AS SELECT DISTINCT t.gvkey, t.lpermno, t.DATE, t.ret, cd.rdq FROM retEPdates t, ERD cd WHERE cd.gvkey=t.gvkey AND t.DATE=cd.rdq ORDER BY t.lpermno, t.DATE; quit; /* temporary dataset will disappear unless you store it. One way to save dataset is to store it under your home directory by specifying libname "own" */ DATA own.EPdata; SET MATCH; format DATE date9.; run;
How to execute this code
wrds(~)%
wrds(~)% nohup sas FindEarningRepDates.sas &
[1] 28916
wrds(~)%
When it is done, it will display completion
wrds(~)%
[1] Exit 9 sas FindEarningRepDates.sas
Sample output
EPdata.sas7bdat is the final output file. You can open this file from SAS/SAS viewer or you can export to .txt, .csv, or excel files as you wish
gvkey | lpermno | date | ret | rdq |
012141 | 10107 | 28JAN2010 | -0.017189 | 20100128 |
012141 | 10107 | 22APR2010 | 0.001851 | 20100422 |
012141 | 10107 | 22JUL2010 | 0.028662 | 20100722 |
012141 | 10107 | 28OCT2010 | 0.008829 | 20101028 |
014489 | 11081 | 18FEB2010 | 0.022608 | 20100218 |
014489 | 11081 | 20MAY2010 | -0.044059 | 20100520 |
014489 | 11081 | 19AUG2010 | -0.012305 | 20100819 |
014489 | 11081 | 18NOV2010 | 0.023595 | 20101118 |
006066 | 12490 | 19JAN2010 | 0.017909 | 20100119 |
006066 | 12490 | 19APR2010 | 0.012248 | 20100419 |
006066 | 12490 | 19JUL2010 | 0.013747 | 20100719 |
006066 | 12490 | 18OCT2010 | 0.012548 | 20101018 |
- Write question/comment here.
- answer here.