The most useful Stata pointers, all in one place: irreplaceable material
By 小猪 (计量经济圈)

FILE MANAGEMENT
Gentzkow and Shapiro (2014) “Code and Data for the Social Sciences: A Practitioner’s Guide.” - I strongly recommend reading this before embarking on your very first empirical research project. The guide introduces many useful data management concepts developed in computer science, which will save you a great deal of time over the increasingly long journey of conducting a piece of empirical research in economics. The most important are Chapters 2, 4 and 5, which help you organize your data files and your many Stata do-files (no joke: by the time you publish your empirical paper, you will have tons of Stata code).
TUTORIALS
Essam and Hughes (2016) Stata Cheat Sheets --- All the important Stata commands at a glance. (HT: Marc Bellemare)
Lembcke (2009) “Introduction to Stata” and “Advanced Stata Topics” --- These are the Stata course lecture notes for PhD students at the Department of Economics, LSE. Since 2004, each year’s course instructor has updated and expanded them. I took the course in 2004, but the current version of the lecture notes covers much more than what I learned in the course. You will learn a lot from them. In particular, “Advanced Stata Topics” touches on how to write and publish your own Stata program, maximum likelihood estimation in Stata, and how to use Mata (Stata’s matrix programming language), topics that are usually not covered in a Stata course for economists.
Using Stata to Analyze Survey Data by Nicholas Minot (IFPRI): This is an excellent introduction to Stata specifically tailored for would-be development economists.
Maybe useful:
A. Colin Cameron and P. K. Trivedi, Microeconometrics: Methods and Applications
Germán Rodriguez “Stata Tutorial”, Princeton University
Phil Bardsley, Kim Chantala, and Dan Blanchette "Stata Tutorial", University of North Carolina at Chapel Hill
Stata Starter Kit by UCLA Academic Technology Service
INTRODUCTION
What Stata can/can't do by A. Colin Cameron (Dept. of Economics, University of California, Davis)
ADO FILES
To install an ado file, type "ssc install xxx" (where xxx should be replaced with the name of the ado file) in your Stata interactive session.
DO FILES
Making do-files is essential because it allows other researchers to replicate your empirical analysis. It has increasingly become the norm among empirical researchers to post online the Stata do-files used to produce the results in published papers. Here are some websites on how to write do-files.
Michael S. Hill (2015) "In Stata coding, Style is the Essential: A brief commentary on do-file style"
Stata Tutorial by Carolina Population Center, University of North Carolina
An Introduction to Stata by Aimee Chin at MIT
Stata section of Guide to Genetic Analysis by Centre for Integrated Genomic Medical Research (Links to example do-files are dead, but it contains some information on editor software.)
Using external text editors to write do files by Friedrich Huebler
RA Manual Notes on Writing Code, by Matthew Gentzkow and Jesse M. Shapiro (2012), offers best practices in computer programming that are useful for writing Stata do files (and scripts for other software).
Stata help for timer: A useful command if you run a do file containing a command that takes very long to execute (e.g. a regression with a lot of fixed effects).
If you use Stata/MP on cluster computing facilities, see Stata Help: statamp.
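As a rough illustration of the timer command mentioned above, the sketch below times a single regression. It uses Stata's bundled auto dataset as a stand-in for a genuinely slow job, and the timer number 1 is an arbitrary choice.

```stata
* Time one command with timer number 1
sysuse auto, clear
timer clear 1
timer on 1
regress price mpg weight
timer off 1

* Show the elapsed seconds (also returned in r(t1))
timer list 1
```

In a long do-file, wrapping each slow block in its own timer number makes it easier to see where the running time goes.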
READING FILES
Every data analysis begins with opening a data file. First, look at this website for the jargon of data formats. (The description of rectangular files is wrong, though.)
Stata Help infiling: Official guide on which command to use for reading different types of data.
Excel
Excel files can finally be imported by a Stata command: import excel.
For earlier versions of Stata to read an Excel file, follow this blog entry. Make sure to use the forward slash (/) rather than the backslash (\) for the path name. It should then work.
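For Stata versions that do have import excel, a minimal round trip might look like the sketch below. The file name auto_demo.xlsx is hypothetical; the workbook is written with export excel first only so that the example is self-contained.

```stata
* Write a small workbook so there is something to import
sysuse auto, clear
export excel make price mpg using "auto_demo.xlsx", firstrow(variables) replace

* Read it back; firstrow treats the first row as variable names
import excel using "auto_demo.xlsx", firstrow clear
describe
```

With a real workbook, only the import excel line is needed, with the path written using forward slashes.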
Stata
There is a useful ado program named USE10 which allows you to read Stata version 10 data with Stata version 9. Type “ssc install use10” to install it.
SPSS
To read SPSS data files, use the usespss ado. (HT: David McKenzie.)
CSV
If each data entry is separated by a comma (called the CSV format), use INSHEET.
If your data includes an identification number with more than 7 digits, make sure you add the double option to the insheet command. Read Stata Help for data_type for details.
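The sketch below shows the double option on a made-up two-row CSV file. The file name ids_demo.csv, the variable names, and the values are all hypothetical; the file is written from within Stata only to keep the example self-contained.

```stata
* Create a tiny CSV file with nine-digit identifiers
tempname fh
file open `fh' using "ids_demo.csv", write text replace
file write `fh' "id,income" _n
file write `fh' "123456789,1000" _n
file write `fh' "123456790,2000" _n
file close `fh'

* Read it with numeric variables stored as double
insheet using "ids_demo.csv", comma names double clear
format id %12.0f
list
```

A float holds only about seven significant digits, which is why long identifiers need double (or a string type) to stay exact.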
Tab-delimited
If the separator is a tab or a space, use INFILE.
Fixed format
If the data file is in the fixed format (no separator between data entries; entries are identified by column numbers), it is trickier. There are three cases:
(1) If it's a flat file (each single line represents one observation), see Stata: How to Write a Dictionary Program to Read Raw Data by the Electronic Data Service (EDS) at Columbia University;
(2) If it's a rectangular file (a fixed number of lines represents one observation), see "Example of a Program to Read Data with Multiple Records/Case" at the bottom of Stata: How to Write a Dictionary Program to Read Raw Data by the Electronic Data Service (EDS) at Columbia University;
(3) If it's a hierarchical file (a flexible number of lines represents one observation, such as World Fertility Surveys), see Stata: How to Read Hierarchical Files in Stata by the Electronic Data Service (EDS) at Columbia University.
From scratch
To create a dataset from scratch, first type “drop _all” and then type “set obs #” where # is the number of observations in this new dataset. Then create variables with the generate command etc. For a small dataset, you can use the INPUT command to directly enter the data.
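Both routes described above can be sketched as follows. The variable names and values are made up for illustration.

```stata
* Route 1: start empty, set the number of observations, then generate
drop _all
set obs 5
generate id = _n
generate x  = id^2
list

* Route 2: type a small dataset in directly with input
clear
input id str5 name score
1 "Ann" 3.5
2 "Bob" 4.0
3 "Cy"  2.5
end
list
```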
Multiple files in the same directory
To read many files in the same directory and append them all, see Append Many Files by UCLA.
EDITING DATA STRUCTURE
Before starting to edit the data itself, you need to edit the structure of data files: reshape, append, and merge.
RESHAPE: Whenever you use datasets downloaded from the World Development Indicators, you need to do this.
Using Stata's RESHAPE command, by Amy Yuen at Electronic Data Center of Emory University General Libraries
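As a minimal sketch of the wide-to-long case, assume a hypothetical extract with one row per country and one column per year (country codes and values below are invented):

```stata
* Wide layout: one GDP column per year
clear
input str3 country gdp2000 gdp2001 gdp2002
"AAA" 1.0 1.1 1.2
"BBB" 2.0 2.1 2.3
end

* Long layout: one row per country-year
reshape long gdp, i(country) j(year)
list, sepby(country)
```

Here gdp is the common stub of the wide variable names, i() names the unit identifier, and j() names the new variable that receives the year suffix.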
APPEND/MERGE: Good empirical research often relies on the use of two or more completely different datasets. So you need to append or merge different datasets before starting analysis.
ISID: When you want to merge two datasets which do not share a common unique identifier but do share some variables (e.g. birth date, birth region), the ISID command lets you check whether a certain set of variables uniquely identifies observations. See Stata Help on ISID.
Stata Tutorial Part 4: Manipulating Files, by Syracuse University Library
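The isid check and a one-to-one merge can be combined as in the sketch below. The two tiny datasets and their variable names are hypothetical, and the merge 1:1 syntax assumes Stata 11 or later.

```stata
* Hypothetical first file: one row per person
clear
input id birthyear str5 region
1 1980 "North"
2 1980 "South"
3 1985 "North"
end
isid id                  // exits with an error if id is not unique
isid birthyear region    // the same check for a set of variables
tempfile people
save `people'

* Hypothetical second file
clear
input id wage
1 10
2 12
4 9
end
isid id

* Merge and inspect which observations matched
merge 1:1 id using `people'
tabulate _merge
```

Tabulating _merge after every merge is a cheap way to catch identifiers that appear in only one of the files.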
DATA PROCESSING
How to create dummy variables by Stata FAQs
Create a new dataset by hand by Carolina Population Centre, University of North Carolina
List of math functions by Stata Help - can be used in combination with the generate command to edit variables.
List of operators by Stata Help
Date variables by Data and Statistical Services, Princeton University --- This webpage tells you how to convert date variables into different formats (e.g. convert the variables of year, month, and day into one date variable etc.).
To categorize observations by percentile bins, use the command xtile. See this Statalist message.
UNIQUE: Stata module to report number of unique values in variable(s) --- Sometimes this ado command is useful. For example, you may want to know whether a particular variable takes more than one value for each group of observations. To see the detail, type “ssc install unique” to install the ado file and then type “help unique” for its help.
REGEXM: useful if you want to identify observations whose string variable contains a particular set of letters.
Loop over all values of a particular variable: there is a lesser-known command LEVELSOF, which returns the list of all values of the specified variable in r(levels) (and, with the local() option, in a local macro).
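The xtile, regexm, and levelsof items above can be tried out on Stata's bundled auto dataset, as in the sketch below. The new variable names (price_q, is_toyota) and the local macro name reps are arbitrary choices.

```stata
sysuse auto, clear

* xtile: assign each car to a price quartile (1 = cheapest)
xtile price_q = price, nquantiles(4)
tabulate price_q

* regexm: flag observations whose string variable matches a pattern
generate byte is_toyota = regexm(make, "^Toyota")
list make if is_toyota

* levelsof: collect the distinct values of rep78, then loop over them
levelsof rep78, local(reps)
foreach r of local reps {
    quietly summarize price if rep78 == `r'
    display "rep78 = `r': mean price = " %9.2f r(mean)
}
```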
SUMMARY STATISTICS
ESTPOST - This is part of the ESTOUT ado file package, automating the process of creating a table of summary statistics. Highly recommended.
Section 6 (pages 33-43) of Using Stata for Survey Data Analysis by Nick Minot at IFPRI --- Very useful, especially if you are analyzing household survey data.
How to conduct a t-test for survey data, by UCLA Academic Technology Service --- Useful if each observation in your data needs to be weighted according to the sampling method. See also how to use the SVY command.
Generating Regression and Summary Statistics Tables in Stata: A Checklist and Code, by Matthew Groh (May, 2014) --- Provides an example do file that uses the MAT2TXT Stata module.
ESTIMATIONS
Overview of Stata estimation commands
Stata Textbook Examples: Econometric Analysis of Cross Section and Panel Data by Jeffrey M. Wooldridge, by UCLA Academic Technology Service --- First, find an example of the estimation method you want to conduct in Wooldridge's graduate econometrics textbook. Then log on to this webpage to see what Stata command does the estimation you want.
Beyond simple OLS estimation by UCLA Academic Technology Service - robust estimation, clustering, quantile regression, linear hypothesis testing, errors-in-variables regression (eivreg), censored/truncated data, SUR, multivariate regression, etc.
Fixed effects estimation
The XTREG command with the FE option (i.e. fixed effects estimation) has recently been modified. See what’s new in Stata 10 (items 4, 5, and 7 in particular) and in Stata 11 (the fourth bullet point in particular).
Fixed Effects Estimation (xtreg command with fe option) by Stata FAQ - explains why there is a constant term in the estimation result table.
Differences among within, between, and overall R-squared obtained by the xtreg, fe command by Justin Smith (15 August 2006)
R squared in Fixed Effects Estimation by Stata FAQ - explains why reported R squared is different between xtreg, fe and areg. See also this note by Indiana University Information Technology Services. Theoretical background can be found in Hayashi's Econometrics textbook (pages 333-4), for example. (This issue seems to be outdated with the xtreg command improved in Stata version 10 or higher.)
If you notice that the areg command and the xtreg command with the fe option produce different clustered standard errors from each other, read this.
Prais-Winsten panel regression: use the XTPCSE command. Examples include Rohlfs et al (2010).
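To see the xtreg, fe versus areg comparison discussed above in practice, the sketch below runs both on Stata's bundled auto dataset. Treating rep78 as the group identifier is purely an assumption for illustration; the data are not a real panel.

```stata
sysuse auto, clear

* Within estimator via xtreg, fe
xtset rep78
xtreg price mpg, fe
scalar b_xtreg  = _b[mpg]
scalar r2_xtreg = e(r2_w)

* Same slope via areg, which absorbs the group dummies
areg price mpg, absorb(rep78)
scalar b_areg  = _b[mpg]
scalar r2_areg = e(r2)

display "slope:     xtreg = " b_xtreg "   areg = " b_areg
display "R-squared: xtreg (within) = " r2_xtreg "   areg = " r2_areg
```

The slope on mpg should coincide, while the R-squared differs because areg counts the variation explained by the absorbed group dummies and the within R-squared does not.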
Weighted least squares estimation
Weighted Least Squares when the variance of the error term is known by Stata Help
Choosing the Correct Weight Syntax by UNC Carolina Population Center - if you wonder what pweight, fweight, aweight, and iweight really mean.
Weighted Least Squares Regression by UCLA Academic Technology Service (See Deaton (1997) The Analysis of Household Surveys, pp.67-73, for the use of weighted least squares in the context of survey design.)
Probit, logit, and other nonlinear regressions
MARGINS: a command introduced in version 11 that reports the average value of the predicted dependent variable at each specified value of the regressors (if I understand correctly). Useful for interpreting estimated coefficients from nonlinear regressions, as explained by SSCC at University of Wisconsin-Madison.
INTEFF: this is an ado package to correctly estimate the magnitude and standard errors of the effect of an interaction term in nonlinear models such as probit and logit. See Ai and Norton (2003) for detail. This command, however, does not work if there are quite a few dummy variables as regressors. It seems the MARGINS and MARGINSPLOT commands supersede INTEFF.
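A minimal sketch of margins after a probit, using Stata's bundled auto dataset. The model itself (foreign on mpg and weight) is only an illustration, not a recommended specification.

```stata
sysuse auto, clear
probit foreign mpg weight

* Average marginal effects on Pr(foreign = 1)
margins, dydx(mpg weight)

* Average predicted probability at chosen values of mpg
margins, at(mpg = (15 20 25 30))
```

The dydx() output is usually easier to report than raw probit coefficients because it is on the probability scale.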
Event study
How to conduct an event study estimation with Stata by Data and Statistical Services, Princeton University
Attrition bias
Lee (2009)’s treatment effects bounds. In the case of attrition bias, this method is now the industry standard. Now you can easily do it in Stata with the leebounds command.
Standard errors
Bootstrapping: See Lecture 4 (pages 6-8) in Programming in Stata, RLAB Data Service, London School of Economics.
X_OLS: Timothy Conley's standard error correction for spatial correlation. This is the standard way of calculating standard errors in the literature when you use data where outcomes and regressors are spatially correlated.
Douglas Miller’s Stata code page contains a Stata do file to execute Cameron, Gelbach, and Miller (2008)’s Wild Bootstrap standard error clustering method, which is increasingly popular among applied microeconometric researchers when the number of clusters is small.
Matching estimation
CEM: Coarsened Exact Matching, by Iacus, King, and Porro (2008), for creating a control group whose observables are balanced against the treated group ex ante. Used by Azoulay, Zivin, and Wang (2010).
Matching Estimators ado file by Abadie, Drukker, Herr, and Imbens
Synth by Abadie, Diamond, and Hainmueller --- A method to estimate the treatment effect from observational data when only one unit is treated.
Pair-wise Mahalanobis matching with an optimal greedy algorithm: See page 209 of Bruhn and McKenzie (2009). This article’s replication data file (click “Download Data Set” on this webpage) contains Stata code for this matching method.
AFTER EACH REGRESSION IS RUN...
How to interpret output tables that appear after executing commands such as summarize, regress, logistic, etc. by UCLA Academic Technology Service
reformat ado-file, by Sealed Envelope Ltd. - This ado-file is useful when you have tons of fixed-effects (e.g. country dummies) and are interested in coefficients on these dummies.
Stata Class 3, by Stas Kolenikov, Duke University - introduces commands after estimation for plotting residuals etc.
From version 10, you can save estimation results to disk with the command estimates save. As a result, it is no longer necessary to install the ESTSAVE ado.
parmest ado-file allows you to create a Stata data file of coefficient estimates along with t-values and p-values. By default, Stata does not store t-values and p-values after regressions. This ado-file is useful if you need to use t-values and/or p-values after each regression is run.
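The sketch below illustrates estimates save, and shows that t-values and p-values can be rebuilt by hand from the stored coefficients when parmest is not installed. The file name price_model and the scalar names are hypothetical.

```stata
sysuse auto, clear
regress price mpg weight

* Save the results to disk (creates price_model.ster) and reload them
estimates save "price_model", replace
estimates use "price_model"
estimates replay

* Rebuild the t-value and two-sided p-value for mpg
scalar t_mpg = _b[mpg] / _se[mpg]
scalar p_mpg = 2 * ttail(e(df_r), abs(t_mpg))
display "t = " t_mpg "   p = " p_mpg
```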
REPORT ESTIMATION RESULTS
ESTOUT - A great ado-file package to create a table of regression results either in the text file format, in the HTML format, or in the TeX format! It's more versatile than OUTREG2 (see below). It is slightly complicated, but it's worth paying the fixed cost of learning how to use it. To minimize the fixed cost, follow these steps:
To install the package, see here.
First, learn how to use ESTSTO by reading this.
Then, learn how to use ESTTAB by reading this.
Only for fancier things do you need to learn ESTOUT (the more flexible version of ESTTAB) and ESTADD (the more flexible version of ESTSTO's ADDSCALARS option).
With the ESTOUT package, you can easily create a summary statistics table!
The ESTOUT package also allows you to include "YES" or "NO" to indicate whether a certain set of fixed effects are controlled for (a standard practice in labor economics type research). See this document.
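The steps above can be sketched roughly as follows. This assumes the community-contributed estout package is already installed (ssc install estout) and Stata 11 or later for the i.foreign notation; the model names m1 and m2 and the macro name origin_fe are hypothetical, and the linked documentation remains the authority on option details.

```stata
sysuse auto, clear

* Summary statistics table via estpost + esttab
estpost summarize price mpg weight
esttab, cells("mean sd min max") nomtitle nonumber

* Store two regressions and tag each with a YES/NO indicator
eststo clear
eststo m1: regress price mpg
estadd local origin_fe "NO"
eststo m2: regress price mpg i.foreign
estadd local origin_fe "YES"

* Show the table in the Results window;
* add "using table.tex, replace" after the model names to write a file
esttab m1 m2, se keep(mpg) stats(origin_fe N, labels("Origin dummy" "Observations"))
```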
Generating Regression and Summary Statistics Tables in Stata: A Checklist and Code, by Matthew Groh (May, 2014) --- If you prefer creating regression tables in the Excel format.
TABOUT - Seems to be a very useful ado for automating the process of creating any kind of table formatted to appear in an academic paper. Example Stata do files mentioned in this tutorial can be downloaded at the author’s website.
OUTREG2.ado - An improved version of OUTREG.ado (see below). It's less versatile than ESTOUT, but it's more flexible in producing a TeX file. One problem is that, after fixed effects estimation (areg or xtreg, fe), the nocons option does not work.
How to use outreg.ado, by Kellogg Research Computing, Northwestern University - probably the most useful explanation of the outreg ado file, including the PDF file of the outreg help file. When you want to use the addstat option for reporting more than 10 statistics, outreg does not work properly. A solution can be found here (Statalist archives). (If you want to further convert the resulting EXCEL file into a LaTeX format, download EXCEL2LATEX here and extract the downloaded zip file into "C:\Documents and Settings\username\Application Data\Microsoft\AddIns" (where "username" is your own username). Then open Excel, click "Tools - Add-Ins..." and check the box for Excel2Latex. You'll see a new small icon in the toolbars. Select the table you want to convert and then click the icon. Now you can create a TeX file of your table.)
How to report multinomial logit regression results with OUTREG, by Statalist
GRAPHICS
Online Tutorial for Making Graphs by Stata Corp. - An excellent website in the sense that you can choose the visual image (rather than picking words like “bar graphs”, “scatter plots”, etc.) to learn how to make various types of graph.
How to make various types of graph (Follow links below the heading of "Graphics") by UCLA Academic Technology Service - Useful if you want to make twoway graphs.
BY option for GRAPH command by Stata Help - this is how to make graphs for each category (e.g. country by country).
BINSCATTER - A Stata package written by Michael Stepner, which allows you to create a scatter plot from (literally) millions of observations, by grouping observations into several intervals of the x variable and plotting the average value of the y variable for each group. (HT: David Seim)
Nonparametric regression curve in a scatter plot - search for "nonparametric".
Draw kernel density functions for each group in the same graph by UCLA Academic Technology Service
Guide to creating PNG images with Stata by Friedrich Huebler
How to create animated graphics using Stata, by Chuck Huber.
How to create a map from Stata by Friedrich Huebler
Drawing social networks in Stata with Netplot by Rense Corten --- if you are analyzing social network data.
PROGRAMMING
Programming in Stata, RLAB Data Service, London School of Economics: these are lecture notes for a Stata course at Department of Economics, LSE. Lectures 3 to 5 deal with how to make your own program with Stata (macro, looping, ado-file, etc.). Very useful.
How to display variable labels: See this Statalist message by Nick Cox on 27 May, 2010.
The CAPTURE command is useful when executing a do file, especially when you want to conduct different data processing steps depending on whether there is an error (which can be expressed as “if _rc==0” in the Stata code). See the paragraphs below the heading “If as a Way to Control Program Flow” in this webpage.
How do I run Stata in batch mode? (Stata FAQ): if you want to run a do file without launching Stata interactively in Unix
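The capture and _rc pattern described above can be sketched as follows, using Stata's bundled auto dataset. The variable name log_price is an arbitrary choice, and the second generate is made to fail on purpose.

```stata
sysuse auto, clear

* First attempt succeeds, so _rc is 0
capture generate log_price = ln(price)
display "return code after first attempt: " _rc

* Second attempt fails because the variable already exists;
* capture suppresses the error and the do-file keeps running
capture generate log_price = ln(price)
local rc = _rc
if `rc' == 0 {
    display "log_price created"
}
else {
    display "generate failed with r(`rc'); keeping the existing variable"
}
```

Copying _rc into a local macro straight after the captured command keeps the code from being overwritten by a later capture.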
TROUBLESHOOTING
If you always type “set memory 900m” after launching Stata because you use a large dataset, read this.
If you run Stata on Windows and encounter an error message "op. sys. refuses to provide memory, r(909)", you may want to consider ditching Windows. Here's why.
If you encounter an error message "insufficient disk space, r(699)", see this Stata FAQ article.
If you encounter a warning message “Warning: variance matrix is nonsymmetric or highly singular”, see this post in Statalist by Jeff Pitblado of Stata Corp.
If you encounter an error message “could not rename c:\ado\plus\stata.trk to c:\ado\plus\backup.trk r(699);” when you try to install an ado file by the “ssc install” command, read pages 47-48 of Lembcke (2009) “Introduction to Stata”. Unfortunately, this method does not change the Stata setting permanently. Every time you use an ado file, you have to do this.
FROM STATA TO OTHER SOFTWARE
Export tables to Excel, written by Kevin Crow on The Stata Blog.
How to transform dta file into csv file, by UCLA Academic Technology Service. If the data contain many decimal places, make sure to use the format command before the outsheet command so that Stata won’t round values unexpectedly. If you don’t need the top row containing variable names, use the noname option.
Order command by Stata Help - if you want to change the order of variables in the table you create from the Stata dataset.
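Putting the format, order, and outsheet advice above together, a minimal sketch might look like this. The output file name auto_demo.csv and the derived variable price_per_lb are hypothetical.

```stata
sysuse auto, clear
generate price_per_lb = price / weight

* Fix the display format first so the exported text keeps six decimals
format price_per_lb %12.6f

* Put the columns in the desired order, then export without a header row
order make price_per_lb price
outsheet make price_per_lb price using "auto_demo.csv", comma nonames replace
```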
How to edit Stata graphs in Microsoft Word, by Stata FAQ
Stata tools for LaTeX, by UCLA Academic Technology Service - for those of you who write empirical papers with LaTeX.
TEXTBOOK EXAMPLES
Stata commands for examples in Wooldridge's graduate level textbook Econometric Analysis of Cross Section and Panel Data, by UCLA Academic Technology Service
Stata commands for examples in Wooldridge's undergrad level textbook Introductory Econometrics: A Modern Approach, by Boston College Academic Technology Support
Stata commands for Greene's textbook Econometric Analysis (4th ed.), by UCLA Academic Technology Service
Accessible readings behind Stata commands
IVREG2
Murray, Michael P. (2006) "Avoiding Invalid Instruments and Coping with Weak Instruments," Journal of Economic Perspectives, 20(4), p. 128.
CLUSTER option for REGRESS
Deaton (1997) The Analysis of Household Surveys, pp.74-77.
Bertrand et al. (2004) "How Much Should We Trust Differences-in-differences Estimates?," Quarterly Journal of Economics, vol.119, p.271.
KDENSITY
Deaton (1997) The Analysis of Household Surveys, pp.171-76.
The following websites may or may not be useful (I haven't checked them yet):
Tips for using Stata 10, by Survey Design and Analysis Services Pty Ltd
Roger Newson's Stata ado files
Useful Links by Kellogg Research Computing, Northwestern University
Stata materials by Stas Kolenikov, Duke University - includes very graphically well-presented Stata course notes.
Stata ado files by Sealed Envelope Ltd.
Stata Tutorial at University of Essex
Eszter Hargittai's Stata Goodies Page
Stata resources developed by Johannes Schmieder

