STAT 482/682 Assignments FAQ



Where did you get the benchmark data for the Dow and the Russell 2000/3000? We could not locate the data on CRSP. I have data downloaded from Bloomberg that is similar to the benchmark; however, the data runs from 12/31 to 12/31 rather than 1/01 to 1/01, so I was wondering where you got the Dow and Russell data so I can calibrate exactly to yours.

I wonder whether the Dow Jones annual return data come from WRDS. I wasn't able to find them under either CRSP or Compustat. If they don't, could you please point us to where we should be looking for the data? You suggested that we go to the Dow Jones official site for total return data. The Dow Jones Total Return on the site is only available daily, not yearly, and it only goes back to 1987. See

We are having difficulty finding data for Russell and NASDAQ as well... It seems the only place that provides market weighted vs value weighted data is WRDS. Yet we could not find any data for Russell or NASDAQ.

Additionally, how do we normalize the data for the year, if we are able to get it?

I've been exploring WRDS but, with the exception of NYSE/AMEX, I can't find any information summarized by index with which to construct benchmarks. Do you intend for us to identify the membership of each index by year (86 years), look up the PERMNOs for each stock, and pull this information from CRSP for each of the DJIA, SP500, NYSE/AMEX, RUT3000, RUT2000? If that is your intention, can you please tell me if you will be assigning this amount of work for each of the homeworks? We did something similar for just the 10 dogs of the Dow over 30 years in Stat 686. It took a very long time and was considered a project level undertaking. I've added 482 on top of an already heavy load because I'm interested in the topic. So I'm eager to know if I should be auditing the class instead of taking it.

How do we get the master list of "permno" values for each company we need the data for? I was thinking of taking the list of "permno" values and putting it in the CRSP/CRSP-Merged to get the annual variables for each of the companies (public and private).

Public vs. Private Companies on CRSP, Compustat, and CRSP/Compustat Merged

First, we noticed in one of the great ones that they excluded OTC stocks from their public data set. However, aren't companies traded OTC still technically public and should therefore be included?

When we query Compustat for all companies, we get ~10,000 observations (only ~300 are private when we look at exchange codes of 0,1 and 3), and when we query CRSP-Compustat for all companies, we get ~5300 observations (none of which are private.)

Fiscal Year Analysis

The instructions say to find the count and proportions of companies with market cap over the median and mean for each year. However, by definition, exactly half of the companies are going to have market cap over the median each year. Can you please clarify this instruction?
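One possible reading of this instruction, sketched in Python with made-up market caps (the numbers and the helper name `share_above` are purely illustrative): for a right-skewed variable, the share of companies above the median stays near one half, while the share above the mean can be far smaller, so comparing the two may be the point of the exercise.

```python
from statistics import mean, median

def share_above(values, threshold):
    """Return (count, proportion) of values strictly above threshold."""
    count = sum(1 for v in values if v > threshold)
    return count, count / len(values)

# Hypothetical market caps (in $ millions) for a single fiscal year.
caps = [50, 80, 120, 200, 350, 900, 2500, 40000]

above_median = share_above(caps, median(caps))  # median is 275
above_mean = share_above(caps, mean(caps))      # mean is 5525

print(above_median)  # (4, 0.5)
print(above_mean)    # (1, 0.125)
```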

We were wondering what sort of 'conclusions' could be made in relation to the coverage analysis. Are you looking for general comparisons between coverage of publicly and privately traded companies? Or rather, an examination of possible differences between coverage of different privately traded companies and then the same for publicly traded companies? (Or both?) Do you also want us to include a discussion about how CRSP/Compustat collects data (such as linking options as seen in WRDS) as it relates to coverage?

What is the difference between the FYEAR and the DATADATE? When we extract the year from the datadate, it is not always the same as the fyear. Why is this? When examining across time periods, which should we use?
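A minimal sketch of why the two can differ, assuming the usual Compustat convention that a fiscal year ending in January through May is assigned to the previous calendar year, while one ending in June through December is assigned to the current calendar year. The function name `fiscal_year` is hypothetical; check the WRDS variable description before relying on this rule.

```python
from datetime import date

def fiscal_year(datadate):
    """Fiscal year implied by a fiscal-year-end date.

    Assumed convention: year-ends in Jan-May belong to the prior
    calendar year; year-ends in Jun-Dec belong to the current one.
    """
    if datadate.month <= 5:
        return datadate.year - 1
    return datadate.year

# A retailer with a January year-end: calendar year 2016, fiscal year 2015.
print(fiscal_year(date(2016, 1, 31)))   # 2015
# A December year-end: both years agree.
print(fiscal_year(date(2016, 12, 31)))  # 2016
```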

How are Fiscal Year End and Current Fiscal Year End Month distinct?

In the assignment it says that we should expect a few hundred private companies per year, but according to our data we are getting fewer than 10 per year. When I downloaded the data from WRDS, I kept most of the defaults (except for the date variable option). Should we change one of these options? Or is there another exchange code that is private?

How do we calculate the returns from the data? We are thinking we would calculate the Price Adjusted and calculate the yearly return as: Yearly Return = [Price Adjusted(t) - Price Adjusted(t-1)] / Price Adjusted(t-1)
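The calculation proposed in this question can be sketched as follows. The prices are invented, and the function name `yearly_returns` is only for illustration; note that a return computed this way omits dividends unless the adjusted price already reflects them.

```python
def yearly_returns(adj_prices):
    """Simple returns from consecutive year-end adjusted prices.

    Implements (P(t) - P(t-1)) / P(t-1) for each pair of adjacent years.
    """
    return [
        (curr - prev) / prev
        for prev, curr in zip(adj_prices, adj_prices[1:])
    ]

# Hypothetical year-end adjusted prices for one stock.
prices = [20.0, 25.0, 22.5]
returns = yearly_returns(prices)  # one return per year after the first
```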

For 682 pt a: Could you elaborate on what you meant by "bootstrap in order to determine the standard errors of the median"? Do you mean for us to perform random sampling with replacement on the returns data and then apply the 'Studentized Bootstrap'?
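For reference, the plain nonparametric bootstrap mentioned in the question (resample with replacement, recompute the median, take the standard deviation of the resampled medians) might look like the sketch below. This is not the studentized variant, and the data, the seed, and the name `bootstrap_se_median` are assumptions made for illustration.

```python
import random
from statistics import median, stdev

def bootstrap_se_median(data, n_boot=2000, seed=0):
    """Bootstrap standard error of the median.

    Draws n_boot resamples of len(data) with replacement and returns
    the sample standard deviation of their medians.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    medians = [median(rng.choices(data, k=n)) for _ in range(n_boot)]
    return stdev(medians)

# Hypothetical annual returns.
returns = [-0.12, -0.03, 0.01, 0.04, 0.07, 0.09, 0.15, 0.22, 0.31]
se = bootstrap_se_median(returns)
```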

For 682 pt b: Would you recommend that we pull data comparing how filing times have fluctuated across different 10-year spans in order to affirm or reject the null hypothesis? For example, if we intended to look at the period 2006-2016 and found that filing times did improve, should we also look at the period from 1995-2005 or 2000-2010 to determine if the changes from 2006-2016 were significant?

How do I compare distributions?

Financial Data Distribution I

The instructions use the abbreviations TA and CA in various definitions; however, based on Chapter 4 in QFA and the CRSP/Compustat terms, it seems as though ACT (Current Assets - Total) works in place of both TA and CA. Am I understanding this correctly?

Do you want three separate portfolios based on each variable or can we do one portfolio based on a combination of our three variables? We have already constructed a portfolio that ranks our companies by each variable at different levels. For instance, we used market cap as our primary variable, then sorted by dividend yield as our secondary variable, and then sorted by price/sales as our third variable. Furthermore, we discussed why we ranked each variable in this way.

WRDS/CRSP Variables

In the description of HW #3 you say that market cap should be readily available; however, I am not seeing it in the CRSP merged database. I have found a variable called "Market Value" but I do not think this is what I want. If I cannot find the variable, should I just multiply "common shares outstanding" by "closing annual price"? I am reluctant to do this, though, because you said to avoid using price data in our calculation.

Regarding CFO, "CFOPS" does not appear in the CCM variable list. What should we use for this?

Regarding CFO, we used XOPR, would that be best?

The data is improperly formatted. The values entered for fiscal year end and actual filing date are not recognized as dates. Because of this, the functions that calculate the number of days between two dates do not work. There are about 2000 observations for each of the 16 years, so I think it's a little unreasonable for me to go through and manually calculate this for each observation. I'm not really sure what I should do.
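If the dates arrive as text, one way to avoid manual work is to parse them explicitly before subtracting. The sketch below assumes the values look like `20151231` (YYYYMMDD); the actual layout in the download is not stated here, so the `fmt` argument would need to be adjusted to match, and `days_between` is a hypothetical helper name.

```python
from datetime import datetime

def days_between(start_text, end_text, fmt="%Y%m%d"):
    """Number of days from start_text to end_text, both parsed with fmt."""
    start = datetime.strptime(start_text, fmt).date()
    end = datetime.strptime(end_text, fmt).date()
    return (end - start).days

# Fiscal year end vs. filing date, assumed YYYYMMDD text.
lag = days_between("20151231", "20160229")
print(lag)  # 60

# The same dates written as MM/DD/YYYY.
lag_slashes = days_between("12/31/2015", "02/29/2016", fmt="%m/%d/%Y")
print(lag_slashes)  # 60
```

Applying the same function to every row (in a loop or a spreadsheet-style column operation) gives the filing lag for all observations at once.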

WDB Chapter 4 problems

In the given income statement for problem 4.1, there is a balance for "Operating Depreciation" that is listed just below Cost of Goods Sold, and is thus incorporated into the listed Gross Income (which is before SG&A expenses). Is this depreciation the classic depreciation expense that is associated with Plant, Property, and Equipment? Or is this referring to depreciation of currently held inventory? Because we are requested to calculate some fixed asset ratios that ask for annual depreciation, we assumed that the listed "Operating Depreciation" refers to Plant, Property, and Equipment. However, I was taught this kind of depreciation is listed outside of the Gross Profit, which makes me think the depreciation applies to Inventory, not PP&E. Can you please clarify what this "Operating Depreciation" applies to?

In 4.3, first, the only way we can interpret the problem so that it makes sense is that the numbers listed are in thousands, so 50,000 means 50 million. Please let us know if this is not the case. For part b, we are required to reconcile the firm's tax payment. However, even after adjustments, to me it seems like the company has a negative tax liability, meaning it won't have a tax payment and would be adding a deferred tax asset. But I am not so sure the book intends us to get that deep into the accounting. So would a demonstration that the company owes 0 taxes be a sufficient answer? Alternatively, for part a, the problem mentions copyrights owned by the company are conservatively appraised at $100 million, while listed on the balance sheet at $10 million. Because the problem asks for adjusted financial statements, to me this means we increase assets for the full value, and add a corresponding increase to owners' equity. If, however, this increase is not supposed to affect owners' equity and instead be recogniz

Volatility Deciling

What kind of returns are we using? Are we pulling these directly from CRSP database or are we calculating these? CRSP has Equal-Weighted and Value-Weighted Returns. Which should we use?

When calculating annualized historical volatilities for all CRSP stocks by year for 1980-present, are we calculating the standard deviation using all DAILY returns in each year, and then getting the annualized volatility for EACH year for EACH stock? If yes, from what I collected from CRSP, I got over 1.5GB of data. Or am I not obtaining the right data? Or are we using the ANNUAL returns throughout this 1980-2017 time period to obtain the standard deviation, and then getting the annualized volatility for EACH year for EACH stock?
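The first interpretation in this question (daily returns, one volatility per stock per year) can be sketched as below. The 252-trading-day scaling factor is a common convention assumed here, not something the assignment states, and the function names and sample rows are hypothetical.

```python
from math import sqrt
from statistics import stdev

TRADING_DAYS = 252  # assumed annualization factor

def annualized_vol(daily_returns, periods_per_year=TRADING_DAYS):
    """Sample standard deviation of daily returns, scaled to one year."""
    return stdev(daily_returns) * sqrt(periods_per_year)

def vol_by_stock_year(rows):
    """Group (permno, year, daily_return) rows; return vol per (permno, year).

    Groups with fewer than two returns are skipped, since a sample
    standard deviation is undefined for them.
    """
    groups = {}
    for permno, year, ret in rows:
        groups.setdefault((permno, year), []).append(ret)
    return {key: annualized_vol(rets)
            for key, rets in groups.items() if len(rets) > 1}

# Made-up rows: two stocks, a handful of daily returns each.
rows = [
    (10001, 1980, 0.01), (10001, 1980, -0.01),
    (10002, 1980, 0.02), (10002, 1980, 0.00), (10002, 1980, -0.02),
    (10002, 1981, 0.03),  # single observation, skipped
]
vols = vol_by_stock_year(rows)
```

Processing one year of the daily file at a time in this way keeps memory use bounded even when the full extract is large.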

Are we using the NYSE/AMEX/NASDAQ file or individual NYSE, AMEX, NASDAQ files for the calculation of individual stock volatilities?


We are supposed to filter by market cap and trading volume. The trading volume is not in CCM, but it is in CRSP, in both the daily and monthly files. Do you want us to use the monthly stock file, then multiply price by volume, divide by 30, and then merge back into our data from CCM?


Which one exactly is the result we are supposed to beat? The "value decile intersection, 50 stocks" returns, or what?

Do we have to beat him on total return since 1963? Or by 10-yr returns, (average or compound)? Sharpe ratio?

Do we have any restriction on the number of stocks in the portfolio?

Are we rebalancing on an annual basis?


The CAPM homework asks us to select 20 stocks from the Table D-1 ("Select 20 stocks from the list in QFA Table D-1"). However, Table D.1 on pg 585 is the Most Popular Stocks of 2017 and includes stocks like Alibaba, Amazon, Apple, Chipotle, Facebook, etc. Given the analysis is supposed to be from 1970-Present, is this the correct list?

Are we required to use Carhart's 4-Factor model as well, or just CAPM and Fama-French 3-Factor? You mention the 4-factor model in the assignment, but I couldn't tell if it was a suggestion for extra work or a requirement of the homework. 

What are we supposed to use as the risk free rate for the stocks?

Final project

Where is the assignment posted?

How much flexibility do we have in designing our project?

