************************************************** .
*
* Stata syntax to construct a 1998-2004 panel dataset in wide or long form .
* Version 1.2 (23/01/08) .
* Revised following release of Version 3 of the 2004 panel data file.
*
* Requires Stata version 7.0 or later, as it creates some variables with names of 9+ characters.
*
* Compiled by WERS 2004 Information and Advice Service (www.wers2004.info) .
*
************************************************** .

* Construct a panel dataset in wide form .

* Notes:
* 1. Variables containing 1998 data are prefixed with the letter X; those containing 2004 data are prefixed with the letter Y.
* 2. The 2004 variables Xsample, Xoutcome, Xout3 and Xemps2004 are renamed to Ysample, Youtcome, Yout3 and Yemps2004 .
* 3. Other variables already beginning with the letter 'x' receive the X or Y prefix as appropriate, e.g. Xxcode1, Yxcode1, Yxbtitle .
* 4. The unique workplace id (serno) and the weighting variables from the 2004 data file (estwtnr, empwtnr, pqwtnr) do not receive a prefix .
* 5. The original 1998 weight variables (est_wt, grosswt and emp_wt) are dropped since they have been superseded by estwtnr and empwtnr .
* 6. Missing values in 1998 are recoded so that missings on almost all variables are consistently coded to -1 (Not answered) and -9 (DK/NA) .

************************************************** .

* Step 0: Ensure that the 2004 panel data file is sorted by serno

use "X:\your_path\ps9804_pq04v3.dta"
sort serno
save "X:\your_path\ps9804_pq04v3.dta", replace

************************************************** .

* Step 1: Read in (and fix) 1998 data, sort by serno, link files, add prefixes and drop unwanted variables .
clear
set more off
set mem 10000
use "X:\your_path\mq98fin.dta", clear
do "X:\your_path\Mq98fix.do"
label variable btitle2 "Job title of main management respondent (coded from verbatim)"
drop est_wt grosswt emp_wt
foreach varname of varlist xcode1-norgsize {
    rename `varname' x`varname'
}
sort serno
merge serno using "X:\your_path\ps9804_pq04v3.dta"
drop dtwoway _merge
rename xsample ysample
rename xoutcome youtcome
rename xout3 yout3
rename xemps2004 yemps2004
rename asic80 yasic80
rename asic92 yasic92
rename asic2003 yasic2003
rename gor ygor
rename ssr yssr
foreach varname of varlist xcode12-xcode11 {
    rename `varname' y`varname'
}

* Step 2: Enforce consistency in missing values .
* This step is omitted in Stata as there is no simple recode available

* Step 3: Enforce consistency in variable names and derivations .
* Note that the dataset no longer matches the questionnaire in these instances .

rename yzunimem yztu_mem
rename yzunipc yztu_pc
rename yzadm_tot yzcle_tot
rename yzskl_tot yzcrt_tot
rename yzper_tot yzptc_tot
rename ysicode yasic
rename yxbtitle ybtitle2
rename yenumrec yetotrec
gen yzscode=int(ysoc90/10)
recode ybtitle2 1=2 2=1 3=4 4=3 95=7
label define jobtit 1 "Human Resources Manager / Officer" /*
*/ 2 "Personnel Manager / Officer" /*
*/ 3 "Employee ~ Industrial ~ Staff ~ Relations Manager / Officer" /*
*/ 4 "Proprietor / Owner / Managing Director / Partner" /*
*/ 5 "Financial Manager / Company Secretary" /*
*/ 6 "General Manager" /*
*/ 7 "Other specific answer, not codeable to 1-6"
label values ybtitle2 jobtit
recode ydjoint 2=1 3=2
gen xjtimear7=.
label variable xjtimear7 "Do you have any of the following working time arrangements for any non-managerial employees?"
label values xjtimear7 jtimear6

* Step 4: Save the data file in wide form .

save "X:\your_path\PS9804_WIDEv3.DTA", replace

************************************************** .

* Construct a panel dataset in long form .

* Notes:
* 1. Syntax begins with panel dataset in wide form (see above) .
* 2. Workplaces that were not re-interviewed in 2004 are dropped .
* 3. A new variable - YEAR - identifies the year to which each row of data relates .
* 4. Variables that are not wholly consistent between 1998 and 2004 are dropped .
*    The following vars from 2004 are not included in the long dataset. Users will be able to 'recover'
*    many of these by manipulating them to ensure consistency with 1998 (e.g. recoding Xauktot
*    and Yauktot so that they have a consistent code list) .
*
*    Ysoc2000, Yauktot, Yacomp01-12, Yahowch01-13, Ybmanage, Ycfactor1-Yxcfact3, Ycspecia6 (empty),
*    Ydbriefu1-Yxdbrief3, Ydappoin1-Ydappoin5, Ydconsul1-Yxdconsu3, Yeviews-Yxeviews, Yenew, Yenewnum,
*    Yewider, Yewidnum, Yeudrec, Yerequest, Yfvarpay6 (empty), Yfprofit, Yfshar, Yfperfp, Yfmeasur1-Yxfmeasu3,
*    Yfindper1-Yxfindpe4, Yfnmperf, Yfnotpay, Yigroun10 (empty), Yifmoff, Yipatern, Yifamily8 (empty),
*    Yjnwtemp, Yjobsec, Yjtimear8 (empty), Ykmarket, Ykcompet, Yktarge10 (empty), Ykerfis, Ykoptb,
*    Ykimp, Yktable, Ykdeti, Ymrel, Ymlinkdat, Ymnextime
*
* 5. Note that the region and industry identifiers for 2004 added in Version 2 of the panel data file (Yasic92, Yasic80, Yasic2003, Ygor, Yssr) are also not included in the long dataset.

************************************************** .

* Step 1: Read in the wide dataset, drop workplaces that were not re-interviewed and restructure the data file .
* Inconsistent variables are dropped and a new index variable (YEAR) is created .

use "X:\your_path\PS9804_WIDEv3.DTA", clear
keep if pqwtnr<.
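
* Optional check (an addition, not part of the original WERS syntax).
* reshape long needs i(serno) to identify each row of the wide file uniquely.
* A minimal sketch of that check: the do-file stops with an error here if any
* serno appears on more than one row. Delete or comment out if not wanted.
bysort serno: assert _N == 1
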
reshape long @zallemps @zmalfull @zfemfull @zmalprt @zfemprt @ztotmen @ztotwom @zallpte @zallfte @zmng_tot @zpro_tot /*
*/ @ztec_tot @zcle_tot @zcrt_tot @zptc_tot @zsal_tot @zope_tot @zrou_tot @zethnic @zethnicp @ztu_mem @ztu_pc @zanymem /*
*/ @zscode @asic @asingle @acontrol @aconhead @aphras01 @aphras02 @aphras03 @aphras04 @aphras06 @aphras07 /*
*/ @aphras09 @aphras10 @bsex @btitle2 @bumanage @bproport @bboard @bstrateg @cfillvac @cspecia1 @cspecia2 @cspecia3 /*
*/ @cspecia4 @cspecia5 @cptests @coffjob @cothjob @cteams @cteamhoa @cteamhoc @cteamhod @dbrief @djoint @dissues /*
*/ @dmeet @dhighlev @dcircles @dpropor @dinvplan @dfinance @dwholefi @dstaffin @eanyemp @eemploy1 @eemploy2 /*
*/ @eemploy3 @eemploy4 @eemploy5 @eemploy6 @eemploy7 @eemploy8 @eemploy9 @eunionum @etotrec @ejoint @egroups /*
*/ @esteward @estewnum @estewext @estewtim @eothreps @enumreps @esiton @fvarpay1 @fvarpay2 @fvarpay3 @fvarpay4 /*
*/ @fvarpay5 @fsoc1 @fsoc2 @fsoc3 @fsoc4 @fsoc5 @fsoc6 @fsoc7 @fsoc8 @fsoc9 @ipolicy @igroun01 @igroun02 @igroun03 /*
*/ @igroun04 @igroun05 @igroun06 @igroun07 @igroun08 @igroun09 @ipracti1 @ipracti2 @ipracti3 @ipracti4 @ipracti5 @ipracti6 /*
*/ @ifamily1 @ifamily2 @ifamily3 @ifamily4 @ifamily5 @ifamily6 @ifamily7 @jnonem01 @jnonem02 @jnonem03 @jnonem04 /*
*/ @jnonem05 @jnonem06 @jnonem07 @jnonem08 @jnonem09 @jnonem10 @jnonem11 @jagency @jwrkfree @jhomwrk /*
*/ @jtimear1 @jtimear2 @jtimear3 @jtimear4 @jtimear5 @jtimear6 @jtimear7 @ktarge01 @ktarge02 @ktarge03 @ktarge04 /*
*/ @ktarge05 @ktarge06 @ktarge07 @ktarge08 @ktarge09 @mrelate, i(serno) j(yr) string
encode yr, generate(year)
recode year 1=1998 2=2004
keep serno zallemps-mrelate pqwtnr year

* Step 2: The weight variable for re-interviewed cases is matched onto the resulting data file .
* This step is not required in Stata

* Step 3: Save the data file in long form .

save "X:\your_path\PS9804_LONGv3.dta"
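
************************************************** .

* Optional checks on the long dataset (an addition, not part of the original WERS syntax).
* A minimal sketch, assuming the long dataset created above is still in memory.
* Each assert stops the do-file with an error if the condition fails; none of
* these commands changes the values in the saved file.

* Each workplace-year combination should appear exactly once
bysort serno year: assert _N == 1

* Each re-interviewed workplace should contribute two rows (1998 and 2004)
bysort serno: assert _N == 2

* YEAR should take only the two survey years
assert year == 1998 | year == 2004

* The two years should show equal numbers of rows
tabulate year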