Final Project
Written By: Ling Xin Low, Zeyu Tian, and Dongwei Zhu
Introduction
This project has two parts: a regression of our own and replications of the regressions in three articles that use the same dataset. Our regression examines how certain variables affect how long partners stay together, and asks which variables are the most statistically significant predictors of relationship duration. We use data from the How Couples Meet and Stay Together (HCMST) dataset to determine whether variables such as ethnicity and education affect the length of a couple’s relationship.
Variables
Here is a description of each variable that we will use in our regression:
how_long_relationship reports the respondent’s relationship duration in years
ppage reports the respondent’s age in years
pphouseholdsize reports the household size of the respondent
ppeduc indicates the highest degree of education received by the respondent, with the following classifications:
o 1 No formal education
o 2 1st, 2nd, 3rd, or 4th grade
o 3 5th or 6th grade
o 4 7th or 8th grade
o 5 9th grade
o 6 10th grade
o 7 11th grade
o 8 12th grade no diploma
o 9 High school diploma or equivalent
o 10 Some college, no degree
o 11 Associate degree
o 12 Bachelor’s degree
o 13 Master’s degree
o 14 Professional or doctorate degree
ppethm indicates the race/ethnicity of the respondent, with the following classifications:
o 1 White, Non-Hispanic
o 2 Black, Non-Hispanic
o 3 Other, Non-Hispanic
o 4 Hispanic
o 5 2+ races, Non-Hispanic
papreligion indicates the religion of the respondent, with the following classifications:
o 1 Baptist
o 2 Protestant
o 3 Catholic
o 4 Mormon
o 5 Jewish
o 6 Muslim
o 7 Hindu
o 8 Buddhist
o 9 Pentecostal
o 10 Eastern Orthodox
o 11 Other Christian
o 12 Other Non-Christian
o 13 None
glbstatus indicates if respondent is lesbian, gay, or bi, with a ‘0’ indicating that they are not and a ‘1’ indicating that they are
ppmarit indicates the marital status of the respondent, with the following classifications:
o 1 Married
o 2 Widowed
o 3 Divorced
o 4 Separated
o 5 Never married
o 6 Living with partner
ppwork indicates the respondent’s employment status, with the following classifications:
o 1 Working – as a paid employee
o 2 Working – self-employed
o 3 Not working – on temporary layoff from a job
o 4 Not working – looking for work
o 5 Not working – retired
o 6 Not working – disabled
o 7 Not working – other
HCMST is a nationally representative longitudinal survey of 4,002 English literate adults, of whom 3,009 had a spouse or romantic partner.
Using the HCMST data, we will estimate the following model:
how_long_relationship = β0 + β1·age + β2·householdsize + γ1·education + γ2·ethnicity + γ3·religion + γ4·glbstatus + γ5·maritalstatus + γ6·workingstatus
Regression Results
We modeled the variables using linear regression and checked that the model did not violate the assumption of linearity. To guard against violations of the constant-variance assumption, we used robust standard errors in our regression. We entered the categorical variables as indicator variables, so that each coefficient is interpreted relative to the omitted first category of that variable. To avoid overfitting, we kept a ratio of at least 10:1 between our sample size and the number of variables.
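One way a model of this kind could be specified in Stata is sketched below. It is an illustration only: it assumes the HCMST data are already in memory and that the variables carry the names listed under Variables. The i. prefix enters a categorical variable as indicators and omits its first category, and vce(robust) requests robust standard errors. The last line is a hypothetical check of the observations-per-coefficient ratio.

```stata
* Sketch only: assumes the HCMST data are already loaded
* and that variable names match those listed under Variables.
regress how_long_relationship ppage pphouseholdsize ///
    i.ppeduc i.ppethm i.papreligion glbstatus i.ppmarit i.ppwork, ///
    vce(robust)

* Illustrative rule-of-thumb check: observations per estimated
* coefficient (model degrees of freedom plus the constant)
display e(N) / (e(df_m) + 1)
```

A ratio comfortably above 10 would be consistent with the 10:1 guideline described above.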
                                                   Coefficient    Robust Standard Errors
PPage                                               0.614 ***     (0.018)
PPhouseholdsize                                     0.282 ***     (0.109)
PPeduc
  1st, 2nd, 3rd, or 4th grade                       5.110 ***     (1.823)
  5th or 6th grade                                  1.624         (2.068)
  7th or 8th grade                                  3.368 *       (2.066)
  9th grade                                        -1.621         (2.172)
  10th grade                                       -0.294         (1.832)
  11th grade                                        0.185         (1.793)
  12th grade no diploma                            -1.557         (1.616)
  High school diploma or the equivalent            -0.766         (1.085)
  Some college, no degree                          -1.598 **      (1.141)
  Associate degree                                 -1.999 **      (1.211)
  Bachelor’s degree                                -3.483 ***     (1.126)
  Master’s degree                                  -2.720 ***     (1.224)
  Professional or Doctorate degree                 -3.190 ***     (1.340)
PPethm
  Black, Non-Hispanic                              -0.786         (0.794)
  Other, Non-Hispanic                               0.906         (0.892)
  Hispanic                                         -0.961 *       (0.530)
  2+ races, Non-Hispanic                           -0.553         (0.912)
Religion
  Protestant                                       -0.441         (0.653)
  Catholic                                         -0.176         (0.634)
  Mormon                                            0.314         (1.110)
  Jewish                                            1.092         (1.090)
  Muslim                                           -1.777         (2.035)
  Hindu                                            -0.842         (2.249)
  Buddhist                                         -0.861         (1.349)
  Pentecostal                                      -2.873 **      (1.280)
  Eastern Orthodox                                  1.530         (2.023)
  Other Christian                                  -2.075 ***     (0.682)
  Other Non-Christian                              -0.059         (1.052)
  None                                             -0.868         (0.641)
GLBstatus
  LGB (Lesbian, Gay, or Bi) status                 -1.631 ***     (0.560)
PPmarit
  Widowed                                          -7.722 ***     (2.463)
  Divorced                                        -13.946 ***     (0.827)
  Separated                                        -6.080 ***     (1.605)
  Never Married                                    -7.646 ***     (0.495)
  Living with Partner                              -6.920 ***     (0.543)
PPwork
  Working – Self-employed                           0.429         (0.605)
  Not working – On temporary layoff from a job      1.027         (1.375)
  Not working – Looking for work                   -0.594         (0.688)
  Not working – Retired                             4.767 ***     (0.907)
  Not working – Disabled                           -0.707         (0.762)
  Not working – Other                               1.378 ***     (0.522)
Constant                                           -5.793 ***     (1.765)
* p value < 0.1 ; ** p value < 0.05 ; *** p value < 0.01
The regression has a sample size of 2,973 and an R-squared of 0.6423, and it produced an estimate for every coefficient in the model.
Each additional year of the respondent’s age is associated with a relationship that is 0.61 years longer, which is approximately seven months. Both non-categorical variables, age and household size, are statistically significant predictors of how long couples stay together. Each additional household member (counting members who are in the house more than 50% of the time) is associated with a relationship that is 0.282 years longer, which is approximately three months.
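The month figures follow from multiplying each coefficient, which is measured in years, by 12. As a quick check, the arithmetic can be run with Stata's display command:

```stata
* Convert the age and household-size coefficients from years to months
display 0.614 * 12
display 0.282 * 12
```

These evaluate to 7.368 and 3.384 months, which round to roughly seven and three months.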
Each of the categorical variables should be interpreted in relation to its omitted category. Notably, respondents with education beyond high school generally have shorter relationships than respondents with no formal education. However, respondents whose highest education is between 1st and 8th grade stay in relationships longer than those with no formal education. Hispanic respondents are likely to have shorter relationships than White non-Hispanic respondents. Pentecostal respondents and respondents in the Other Christian category (Christian denominations not listed separately in the survey) have shorter relationships than Baptists. Straight respondents stay together longer than respondents who identify as LGB. Respondents in every non-married category have shorter relationships than married respondents, especially divorced respondents. Retired respondents stay in relationships longer than respondents working as paid employees, which is not at all surprising since retirees tend to be older.
Conclusion
The results of our regression indicate that age and the number of people in the household are the most statistically significant variables in determining the length of a couple’s relationship. Both have a p value of less than 0.01, so we reject the null hypothesis that their estimated effects are due to pure randomness.
Replication of Regressions
Searching for a Mate: The Rise of the Internet as a Social Intermediary
Duplicated Regression Results of Family and the Internet's Influence on Couple Type
                                               Met through Family    Met Online
                                               Odds Ratio            Odds Ratio
Same-Sex Couples                               0.11***               5.11***
Interracial Couples                            0.61                  1.54**
Interreligious Couples                         0.81*                 1.43**
Mothers’ Educations differ by >= 4 years       0.73*                 0.93
Respondent/Partner education gap >= 4 years    1.84***               0.51**
Respondent/Partner age gap >= 10 years         0.82                  0.94
* p value < 0.1 ; ** p value < 0.05 ; *** p value < 0.01
Our duplicated odds ratios differ noticeably from those in the original article. For couples who met through family, the odds ratios differ by 0.78 for the respondent/partner education gap >= 4 years and by 0.26 for the respondent/partner age gap >= 10 years. For couples who met online, the odds ratios differ by about 1.77 for same-sex couples, 0.69 for interracial couples, 0.76 for the respondent/partner education gap >= 4 years, and 0.24 for the respondent/partner age gap >= 10 years.
Stata Do-File:
gen samerace_couples = 1 if respondent_race == partner_race
replace samerace_couples = 0 if samerace_couples == .
gen respondent_religion = 1 if papreligion == 2
replace respondent_religion = 2 if papreligion == 3
replace respondent_religion = 3 if papreligion == 5
replace respondent_religion = 4 if papreligion == 12
replace respondent_religion = 5 if papreligion == 13
gen partner_religion = 1 if q7b == 2
replace partner_religion = 2 if q7b == 3
replace partner_religion = 3 if q7b == 5
replace partner_religion = 4 if q7b == 12
replace partner_religion = 5 if q7b == 13
replace respondent_race = 0 if respondent_race == 6
replace respondent_race = 6 if respondent_race == 6
replace respondent_race = 7 if respondent_race == .
replace respondent_race = 6 if respondent_race == 0
replace respondent_race = 0 if respondent_race > 5
gen same_religion_couples = 1 if respondent_religion == partner_religion
replace same_religion_couples = 0 if respondent_religion == .
gen mother_education = 1 if respondent_mom_yrsed < partner_mom_yrsed + 4
replace mother_education = 0 if mother_education == .
gen respondent_education = 1 if respondent_yrsed < partner_yrsed + 4
replace respondent_education = 0 if respondent_education == .
gen age_gap = 1 if age_difference < 10
replace age_gap = 0 if age_difference == .
logistic met_through_family samerace_couples same_religion_couples mother_education respondent_education age_gap
logistic met_through_family samerace_couples same_religion_couples mother_education respondent_education age_gap, or
gen interracial_couples = 1 if samerace_couples == 0
gen interreligious_couples = 1 if same_religion_couples == 0
gen mother_education_gap = 1 if mother_education == 0
gen respondent_education_gap = 1 if respondent_education == 0
gen age_gap1 = 1 if age_gap == .
replace interracial_couples = 0 if samerace_couples == 1
replace interreligious_couples = 0 if same_religion_couples == 1
replace mother_education_gap = 0 if mother_education == 1
replace respondent_education_gap = 0 if respondent_education == 1
replace age_gap1 = 0 if age_gap == 1
replace age_gap1 = 1 if age_gap == 0
replace q5 = 3 if q5 == -1
replace q5 = 3 if q5 == .
gen same_sex_couples = 1 if q5 == 1
replace same_sex_couples = 0 if same_sex_couples == .
logistic met_through_family same_sex_couples interracial_couples interreligious_couples mother_education_gap respondent_education_gap age_gap1
logistic either_internet_adjusted same_sex_couples interracial_couples interreligious_couples mother_education_gap respondent_education_gap age_gap1
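The do-file above builds each indicator with a gen ... if followed by a replace ... = 0 if missing, which codes a couple as 0 whenever either underlying value is unknown. The sketch below contrasts that pattern with a missing-aware alternative on a few made-up rows. The toy values and the name samerace_strict are hypothetical; this is not the coding used for the results reported here, and adopting it would change the estimation sample.

```stata
* Toy data (hypothetical values, not HCMST)
clear
input respondent_race partner_race
1 1
1 2
2 .
. 3
end

* Pattern used in the do-file: unknown pairs end up coded 0
gen samerace_couples = 1 if respondent_race == partner_race
replace samerace_couples = 0 if samerace_couples == .

* Missing-aware alternative: 1 if same race, 0 if different,
* and left missing when either partner's race is unknown
gen byte samerace_strict = (respondent_race == partner_race) ///
    if !missing(respondent_race, partner_race)

list
```

In the last two rows the first indicator is 0 while the second stays missing, so those couples would drop out of a regression that uses the stricter version.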
Couple Longevity in the Era of Same-Sex Marriage in the United States
Model 1:
                                   Coefficient    Standard Error
Same Sex Couples                    0.71**         0.10
Model 2:
                                   Coefficient    Standard Error
Same Sex Couples                    0.58**         0.08
Marriage                            0.94***        0.08
Model 3:
                                   Coefficient    Standard Error
Same Sex Couples                    0.60***        0.09
Marriage                            0.68***        0.10
Co-residence                       -0.61**         0.16
Relationship Duration (years)      -0.02           0.01
Relationship Duration              -1.42           0.22
Model 4:
                                   Coefficient    Standard Error
Same Sex Couples                    0.67***        0.10
Marriage                            0.80***        0.13
Same Sex Married                    0.36           0.21
Co-residence                       -0.61***        0.16
Relationship Duration (years)      -0.02**         0.01
Relationship Duration              -1.42***        0.22
Model 5:
                                   Coefficient    Standard Error
Same Sex Couples                    0.33**         0.11
Marriage                            0.94           0.15
Same Sex Married                    0.20***        0.25
Co-residence                       -0.95***        0.16
Relationship Duration (years)      -0.04***        0.01
Relationship Duration              -1.00**         0.32
Relationship Quality                1.26***        0.06
Model 6:
                                   Coefficient    Standard Error
Same Sex Couples                    0.27           0.12
Marriage                            0.98***        0.15
Same Sex Married                    0.32           0.26
Co-residence                       -0.87***        0.16
Relationship Duration (years)      -0.04***        0.01
Relationship Duration              -1.05**         0.31
Relationship Quality                1.32***        0.07
Heterosexual Couples               -0.40***        0.08
Household Income                    0.05***        0.01
Model 7:
                                   Coefficient    Standard Error
Gay Couples                        -0.27           0.22
Lesbian Couples                    -0.51           0.38
Marriage                            0.68**         0.23
Same Sex Married                    0.08           0.42
Co-residence                       -0.87**         0.25
Relationship Duration (years)      -0.09***        0.02
Relationship Duration              -0.75           1.05
Relationship Quality                0.51***        0.14
Heterosexual Couples                0.23           0.30
Household Income                    0.01           0.02
* p value < 0.1 ; ** p value < 0.05 ; *** p value < 0.01
In Model 1, our coefficient and standard error differ from the article’s. In Model 2, our coefficient and standard error for same-sex couples also differ: the article’s coefficient is negative, whereas ours is greater than 0. In Models 3 and 4, our standard errors for co-residence and relationship duration (years) are similar to the article’s. In Models 5 and 6, most of our coefficients differ from the article’s. In Model 7, our standard errors are close to the article’s, but the coefficient of relationship duration is not.
Stata Do-File:
gen breakup = w2w3_combo_breakup if w2w3_combo_breakup>0
replace breakup = 0 if breakup == .
gen samesexcpl = q5 if q5>0
replace samesexcpl = 0 if samesexcpl == .
ologit breakup samesexcpl
gen marriage = s1 if s1>1
replace marriage = 0 if marriage == .
ologit breakup samesexcpl marriage
gen coresidence = coresident if coresident>0
replace coresidence = 0 if coresidence == .
gen relationshipdy = how_long_relationship if how_long_relationship>0
replace relationshipdy = 0 if relationshipdy == .
gen relationshipdur = duration^-0.5 if duration>0
replace relationshipdur = 0 if relationshipdur == .
ologit breakup samesexcpl marriage coresidence relationshipdy relationshipdur
gen samesexmar = 1 if marriage == samesexcpl
replace samesexmar = 0 if samesexmar == .
ologit breakup samesexcpl marriage samesexmar coresidence relationshipdy relationshipdur
gen relationshipq = q34 if q34>0
replace relationshipq = 0 if relationshipq == .
ologit breakup samesexcpl marriage samesexmar coresidence relationshipdy relationshipdur relationshipq
gen heterosexualcpl = w3_xtype if w3_xtype>0
gen householdin = pp2_ppincimp if pp2_ppincimp>0
replace householdin = 0 if householdin == .
ologit breakup samesexcpl marriage samesexmar coresidence relationshipdy relationshipdur relationshipq heterosexualcpl householdin
gen samecpl = s1a if s1a>0
replace samecpl = 0 if samecpl == .
gen gendergay = ppgender if ppgender>1
replace gendergay = 0 if gendergay == .
gen gaycpl = 1 if samecpl == gendergay
replace gaycpl = 0 if gaycpl == .
gen genderlesbian = ppgender if ppgender>2
replace genderlesbian = 0 if genderlesbian == .
gen lesbiancpl= 2 if samecpl == genderlesbian
replace lesbiancpl = 0 if lesbiancpl == .
ologit breakup gaycpl lesbiancpl marriage samesexmar coresidence relationshipdy relationshipdur relationshipq heterosexualcpl householdin
Earnings Equality and Relationship Stability for Same-Sex and Heterosexual Couples
Model 1:
Relationship Quality               Coefficient    Standard Error
Same Sex Couple                     0.675 ***     (0.053)
Same Earnings                       0.697 ***     (0.095)
Model 2:
Relationship Quality               Coefficient    Standard Error
Same Sex Couple                     0.272 ***     (0.081)
Same Earnings                       0.250 **      (0.116)
Same Sex and Same Earnings         -0.763 ***     (0.116)
Model 3:
Relationship Quality               Coefficient    Standard Error
Same Sex Couple                     0.359 ***     (0.087)
Same Earnings                       0.080         (0.124)
Same Sex and Same Earnings         -1.063 ***     (0.125)
Years of Education                 -0.048 ***     (0.013)
Working                            -0.102         (0.070)
Race
  Non-Hispanic Black                0.377 ***     (0.113)
  Non-Hispanic Other                0.282 *       (0.159)
  Hispanic                          0.128         (0.102)
Religion
  1 (Non-Christian)                -0.165 *       (0.093)
  2 (Non-Religious)                 0.039         (0.085)
Age                                -0.017 ***     (0.002)
Staying Together                    1.517 ***     (0.066)
Model 4:
Relationship Quality               Coefficient    Standard Error
Same Sex Couple                     0.016         (0.097)
Same Earnings                      -0.234 *       (0.138)
Same Sex and Same Earnings         -0.043         (0.139)
Years of Education                 -0.012         (0.016)
Working                            -0.158 *       (0.090)
Race
  Non-Hispanic Black                0.490 ***     (0.134)
  Non-Hispanic Other                0.313 *       (0.186)
  Hispanic                          0.113         (0.123)
Religion
  Non-Christian                    -0.087         (0.117)
  Non-Religious                     0.052         (0.102)
Age                                -0.001         (0.004)
Staying Together                   -0.522 ***     (0.089)
Length of Relationship             -0.007         (0.004)
Household Income
  $5,000 to $7,499                 -0.046         (0.592)
  $7,500 to $9,999                  0.743         (0.549)
  $10,000 to $12,499                0.927 *       (0.543)
  $12,500 to $14,999                0.796         (0.518)
  $15,000 to $19,999                0.454         (0.499)
  $20,000 to $24,999                0.504         (0.487)
  $25,000 to $29,999                0.474         (0.489)
  $30,000 to $34,999                0.594         (0.483)
  $35,000 to $39,999                0.543         (0.477)
  $40,000 to $49,999                0.213         (0.470)
  $50,000 to $59,999                0.327         (0.468)
  $60,000 to $74,999                0.222         (0.469)
  $75,000 to $84,999               -0.057         (0.478)
  $85,000 to $99,999                0.274         (0.473)
  $100,000 to $124,999              0.252         (0.475)
  $125,000 to $149,999              0.184         (0.490)
  $150,000 to $174,999              0.032         (0.515)
  $175,000 or more                 -0.017         (0.499)
Children                            0.212 **      (0.090)
* p value < 0.1 ; ** p value < 0.05 ; *** p value < 0.01
Our duplicated regressions differ from the original results. One possible reason is that we used Waves 1 to 5 of the dataset, while the original article used only Wave 1. Comparing the two versions of Model 4, both show highly significant results for the Married/Domestic Partnership/Civil Union factor. There are fairly large differences in the following independent variables: Same Sex Couple, Same Earnings, Same Sex Couple and Same Earnings, Years of Education, and Working Status. For these variables, most of our coefficients are negative, while the original article reports only positive values.
Stata Do-File:
gen relationshipq = q34 if q34 > 0
replace relationshipq = 0 if relationshipq == .
gen samesexcpl = q5 if q5 > 0
replace samesexcpl = 0 if samesexcpl == .
gen sameearn = q23 if q23 == 2
replace sameearn = 1 if sameearn == 2
replace sameearn = 0 if sameearn == .
ologit relationshipq samesexcpl sameearn
gen sameearnsex = 1 if samesexcpl == sameearn
replace sameearnsex = 0 if sameearnsex == .
ologit relationshipq samesexcpl sameearn sameearnsex
gen worker = ppwork
replace worker = 0 if worker <= 2
replace worker = 1 if worker > 0
gen race = 0 if respondent_race == 1
replace race = 1 if respondent_race == 2
replace race = 3 if respondent_race == 6
replace race = 10 if respondent_race == .
replace race = 2 if race == .
replace race = . if race == 10
gen religious = 2 if papreligion == 13
replace religious = 0 if papreligion == 1
replace religious = 0 if papreligion == 2
replace religious = 0 if papreligion == 3
replace religious = 0 if papreligion == 10
replace religious = 0 if papreligion == 11
replace religious = 1 if papreligion == 4
replace religious = 1 if papreligion == 5
replace religious = 1 if papreligion == 6
replace religious = 1 if papreligion == 7
replace religious = 1 if papreligion == 8
replace religious = 1 if papreligion == 9
replace religious = 1 if papreligion == 12
gen staytgt = 1 if q18a_3 == 0
replace staytgt = 1 if ppmarit == 1
replace staytgt = 0 if staytgt == .
ologit relationshipq samesexcpl sameearnsex respondent_yrsed worker i.race i.religious ppage staytgt
gen children = 0 if children_in_hh == 0
replace children = 1 if children_in_hh >= 1
ologit relationshipq samesexcpl sameearnsex respondent_yrsed worker i.race i.religious ppage staytgt how_long_relationship ppincimp children
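The religion grouping in this do-file takes one replace per code. A more compact way to express a grouping like this is sketched below with Stata's inlist() function. The sketch assumes the grouping is codes 1, 2, 3, 10, and 11 mapped to 0, codes 4 through 9 and 12 mapped to 1, and code 13 mapped to 2. The four toy rows are made up for illustration and are not HCMST records.

```stata
* Toy data (hypothetical values, not HCMST)
clear
input papreligion
1
5
13
.
end

* Assumed grouping: 0 for codes 1, 2, 3, 10, 11;
* 1 for codes 4-9 and 12; 2 for code 13 (None)
gen byte religious = .
replace religious = 0 if inlist(papreligion, 1, 2, 3, 10, 11)
replace religious = 1 if inlist(papreligion, 4, 5, 6, 7, 8, 9, 12)
replace religious = 2 if papreligion == 13

* Respondents with a missing religion code stay missing
tabulate religious, missing
```

Listing the codes in one place makes it easier to see which religions fall in each group and to confirm that every code from 1 to 13 is assigned exactly once.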
References
Rosenfeld, Michael J., Reuben J. Thomas, and Maja Falcon. 2015. How Couples Meet and Stay
Together, Waves 1, 2, and 3: Public version 3.04, plus wave 4 supplement version 1.02 and wave 5
supplement version 1.0 [Computer files]. Stanford, CA: Stanford University Libraries.
Rosenfeld, M., & Thomas, R. (2012, June 13). Searching for a Mate. Retrieved November 30, 2015,
from http://asr.sagepub.com/content/77/4/523
Rosenfeld, M. (2014). Couple Longevity in the Era of Same-Sex Marriage in the United States.
Journal of Marriage and Family, 905-918. Retrieved November 30, 2015,
from http://onlinelibrary.wiley.com/doi/10.1111/jomf.12141/abstract
Weisshaar, K. (2014). Earnings Equality and Relationship Stability for Same-Sex and Heterosexual
Couples. Social Forces, 93-123. Retrieved November 30, 2015,
from http://web.stanford.edu/~mrosenfe/how_meet_public/Weisshaar_earnings_equality_SF.pdf