FREQUENTLY ASKED QUESTIONS

The Harmonized COVID datasets can easily be merged with original study or Harmonized datasets using the study-specific unique id variable. Here is an example of Stata code to correctly merge the Harmonized HRS dataset with the Harmonized HRS COVID dataset:

use "filepath/H_HRS_COVID_b.dta"

merge 1:1 hhidpn using "filepath/H_HRS_d.dta"

Here is an example of Stata code to merge only select variables from the Harmonized HRS COVID dataset with select variables from the Harmonized HRS dataset. The variables read in with use must include hhidpn, because it is the merge key:

use variable1 variable2 variable3 using "filepath/H_HRS_COVID_b.dta"

merge 1:1 hhidpn using "filepath/H_HRS_d.dta", keepusing(variable4 variable5 variable6)

To merge between Harmonized ELSA datasets, change the file names and use idauniq instead of hhidpn. To merge between Harmonized SHARE datasets, change the file names and use mergeid instead of hhidpn.
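After a merge of this kind, it can help to check how many respondents matched by looking at the _merge variable that Stata creates. The following is a minimal sketch that uses two small made-up datasets in place of the Harmonized files; the variable names core_var and covid_var and all values are hypothetical.

```stata
* Toy stand-in for the Harmonized core file (hypothetical values)
clear
input long hhidpn byte core_var
1010 1
1020 0
1030 1
end
tempfile core
save "`core'"

* Toy stand-in for the Harmonized COVID file (hypothetical values)
clear
input long hhidpn byte covid_var
1010 0
1020 1
end

* 1:1 merge on the unique person identifier
merge 1:1 hhidpn using "`core'"

* _merge: 1 = only in the data in memory, 2 = only in the using file, 3 = in both
tabulate _merge
```

In this toy case two records match and one appears only in the core file. If only matched respondents are wanted, keep if _merge == 3 would drop the rest.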

It is very simple to merge the Harmonized datasets with the original study data using the unique identifiers employed by the study. You can identify the variables from the original study data you would like to use by searching or browsing for your items of interest on the Search page under the Health & Retirement Studies tab, or you can look through the original survey questionnaire or datasets. In Stata, you can merge in these original survey variables with the Harmonized data using the following Stata code:

merge 1:1 studyID using "filepath/dataset_name.dta", keepusing(variable1 variable2 variable3)

It is important to remember that all Harmonized datasets are individual-level, where each record is one person, but original survey data files can also be couple, household, community, or child-level datasets. All possible identifiers from each study are kept as part of the Harmonized dataset to allow for merging with original survey data files that are not individual-level. If the original survey dataset is not individual-level, you will need to change the merge from 1:1 to m:1, 1:m, or m:m and use the appropriate identifier rather than the unique individual identifier. This method also works when merging variables between the Harmonized HRS, RAND HRS, and RAND HRS Family datasets.
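As a sketch of the non-individual-level case, the example below attaches a household-level variable to individual records with an m:1 merge. The data are made up, and hhid, pid, and hh_income are hypothetical names standing in for a study's household identifier, its person identifier, and an original survey variable.

```stata
* Toy household-level file: one record per household (hypothetical values)
clear
input long hhid double hh_income
10 52000
20 31000
end
tempfile household
save "`household'"

* Toy individual-level file: one record per person, two people in household 10
clear
input long pid long hhid
1001 10
1002 10
2001 20
end

* Many individuals map to one household record, so the merge is m:1 on hhid
merge m:1 hhid using "`household'", keepusing(hh_income)

* Each person now carries the value recorded for his or her household
list pid hhid hh_income _merge
```

Note that the Stata manual cautions against m:m merges; an m:1 or 1:m merge on the appropriate identifier is usually what is needed.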

If you used any of the Harmonized Data or Codebooks in a written analysis, please email papers@g2aging.org to inform our team, and include the following acknowledgement in your written work:

"This analysis uses data or information from the Harmonized [Study] dataset and Codebook, Version [Letter] as of [Month & Year] developed by the Gateway to Global Aging Data. The development of the Harmonized [Study] was funded by the National Institute on Aging (R01 AG030153, RC2 AG036619, 1R03AG043052). For more information, please refer to g2aging.org."

If you used a working paper on cross-country comparability, then please cite the paper; for example,

"Jain U, Min J, Lee J. Harmonization of cross-national studies of aging to the Health and Retirement Study - user guide: Family transfer - informal care. University of Southern California, CESR-Schaeffer Working Paper Series No. 2016-008. Published January 2016."

If you used a graph, table, or map, then please cite the following:

"This graph uses data from the Gateway to Global Aging Data (g2aging.org). The Gateway to Global Aging Data is funded by the National Institute on Aging (R01 AG030153)."

If you used cross-study or longitudinal concordance information, recent presentations, or other tools on the Gateway to Global Aging Data, then please cite the following:

"Gateway to Global Aging Data, Produced by the Program on Global Aging, Health & Policy, University of Southern California with funding from the National Institute on Aging (R01 AG030153)."

You can find an overview of our existing Harmonized datasets on the Overview page under Harmonized Data. We are constantly updating our existing Harmonized datasets and adding new Harmonized datasets.

If you don't see a Harmonized version of a study which is important to your research, please write to us at help@g2aging.org and we will provide you our best estimate of when that Harmonized dataset will be available.

First, you should sign up for the Gateway to Global Aging Data by clicking the person icon on the top right of any page. If you have already signed up, then make sure you're logged in to the Gateway website.

For information regarding how to gain access to the survey and harmonized data for each study, go to the Get Data page under the Health & Retirement Studies tab and click on one of the corresponding boxes underneath "Data Access Instructions".

The Gateway's Stata Creation code pulls original survey variables directly into Stata's working memory to create Harmonized variables. Pulling in the correct original survey variables requires specifying the exact file name of each original survey dataset, and many studies change the file names of their datasets between release versions. If you get a "file not found" error message when running the code, you may not be using the most recent release of the survey data, so please make sure you have the latest version. If you still encounter an error, please write to us at help@g2aging.org and we will help you as quickly as possible.
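Before running the creation code, one way to see which file name is causing the problem is to test for each file with confirm file. This is only a sketch: the local macro rawfile and the path it holds are hypothetical placeholders, not names used by the Gateway's code.

```stata
* Hypothetical path to one original survey file; replace with your own
local rawfile "filepath/original_survey_file.dta"

* capture stores the return code in _rc; it is nonzero if the file cannot be found
capture confirm file "`rawfile'"
if _rc {
    display as error "Not found: `rawfile'. Check the file name and release version."
}
else {
    display "Found: `rawfile'"
}
```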

Some editions of Stata (e.g. Stata/IC) can read fewer variables into working memory than some of our Harmonized datasets contain. However, all editions of Stata allow users to pull select variables from a dataset that has more variables than they could read at once. You can identify the variables you would like to use by searching or browsing for your items of interest on the Search page under the Harmonized Data tab, or you can download and search through the codebook that accompanies the Harmonized dataset on the Get Data page under the Harmonized Data tab. You can create a smaller dataset for your personal use by updating the variable names, filepath, and dataset name in the following Stata code:

use variable1 variable2 variable3 using "filepath/H_dataset.dta"
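The same approach can be tried end to end on a toy file. The sketch below builds a small made-up dataset, lists its variables with describe using (which reads the variable names without loading the data), and then loads a subset. The variable names id, var_a, var_b, and var_c are hypothetical.

```stata
* Build a small toy file standing in for a Harmonized dataset (hypothetical variables)
clear
set obs 3
generate long id = _n
generate byte var_a = 1
generate byte var_b = 2
generate byte var_c = 3
tempfile harmonized
save "`harmonized'"

* List the variables stored in the file without loading any data
describe using "`harmonized'", simple

* Load only the identifier and the variables of interest
use id var_a var_c using "`harmonized'", clear

* Optionally save the smaller file for later use (path is a placeholder)
* save "filepath/my_subset.dta", replace
```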

We are constantly making content updates to the Gateway and try our best to inform our users when new data or documentation becomes available. You can view all recent Announcements & Data Updates under our Additional Resources tab.

To make sure you don't miss any of our future data releases, click on the person icon at the top right of any page, click on My Profile, and make sure that the box for "Data Alerts" is checked. You will then receive an email when we release new datasets and update existing datasets. For additional information regarding Gateway-affiliated events (e.g., seminars, webinars, conferences, Hackathon), data releases, and publications, please follow us on our Twitter page. If you are unsure whether the dataset or information you are using is the most recent version, please write to us at help@g2aging.org for clarification.

How do I merge the Harmonized COVID dataset with the Harmonized Core dataset?

How do I cite the Gateway?

I am interested in using a study which I see on the Gateway but I don't see a Harmonized version of this study. What studies do you Harmonize?

How can I register for the original study so that I can gain access to the study data and Harmonized datasets?

The Stata code I downloaded from the Gateway uses different file names than the data I have. Why is this?

My version of Stata won't let me open the Harmonized dataset because there are too many variables. What can I do?

How can I stay up-to-date with data releases and other Gateway-related news?