*******************
*      Stata      *
*******************

* Syntax for assigning parental proxy information on a variable to target persons.

* General note: The proxy information was usually given by only one (step)parent, so most of the generated variables will contain few valid values and many missing values (especially those with stepparents as informants).
* However, in some cases there are multiple proxy values for one target person. Therefore, we provide a syntax that assigns every available parental proxy value to the target persons so that you can choose which one to use.
  
*--------------------------------
* Stata syntax for assigning parental proxy information for variables to target persons:
*--------------------------------

* Assign parent's proxy information for edu0100t to twin 1:

tempvar m n g f // person code letters of parents and stepparents
local p = 300 // start with ptyp == 300 (mother)
foreach i in m f g n { // loop over the four informant types: mother (m), father (f) and the two stepparent types (g, n)
	gen ``i'' = edu0100t if ptyp==`p' // store the parent-about-child information (valid or missing) given by this informant in a temporary variable
	local p = `p' + 100 // move on to the next informant type (ptyp == 400; 500; 600)
	bys fid: egen edu0100t_`i' = max(``i'') // generate variable with the only valid/non-missing information (= maximum) stored in the temporary variables
	replace edu0100t_`i' = . if ptyp != 1 // only keep information for ptyp == 1 (twin 1), replace with missing for all others
	}

* Assign parent's proxy information for edu0100u to twin 2:

tempvar m n g f // person code letters of parents and stepparents
local p = 300 // start with ptyp == 300 (mother)
foreach i in m f g n { // loop over the four informant types: mother (m), father (f) and the two stepparent types (g, n)
	gen ``i'' = edu0100u if ptyp==`p' // store the parent-about-child information (valid or missing) given by this informant in a temporary variable
	local p = `p' + 100 // move on to the next informant type (ptyp == 400; 500; 600)
	bys fid: egen edu0100u_`i' = max(``i'') // generate variable with the only valid/non-missing information (= maximum) stored in the temporary variables
	replace edu0100u_`i' = . if ptyp != 2 // only keep information for ptyp == 2 (twin 2), replace with missing for all others
	}


* Assign parent's proxy information for edu0100s to sibling:

tempvar m n g f // person code letters of parents and stepparents
local p = 300 // start with ptyp == 300 (mother)
foreach i in m f g n { // loop over the four informant types: mother (m), father (f) and the two stepparent types (g, n)
	gen ``i'' = edu0100s if ptyp==`p' // store the parent-about-child information (valid or missing) given by this informant in a temporary variable
	local p = `p' + 100 // move on to the next informant type (ptyp == 400; 500; 600)
	bys fid: egen edu0100s_`i' = max(``i'') // generate variable with the only valid/non-missing information (= maximum) stored in the temporary variables
	replace edu0100s_`i' = . if ptyp != 200 // only keep information for ptyp == 200 (sibling), replace with missing for all others
	}
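
* The logic of the three loops above can also be sketched in plain Python, for readers who want to check it on a small example. This is only a minimal sketch under stated assumptions: the function name assign_proxy, the list-of-dictionaries data layout and the toy values are hypothetical; only fid, ptyp, the ptyp codes and the suffixes m, f, g, n are taken from the syntax above. Missing values are represented as None.

```python
# Hypothetical illustration in plain Python; it does not replace the Stata syntax.
SUFFIX_BY_PTYP = {300: "m", 400: "f", 500: "g", 600: "n"}  # informant ptyp -> suffix


def assign_proxy(rows, var, target_ptyp):
    """Copy each informant's report in `var` to the target person of the same family."""
    # Collect the largest non-missing report per (family, informant type),
    # mirroring `bys fid: egen ... = max()` in the Stata loop.
    reports = {}
    for row in rows:
        value = row.get(var)
        if row["ptyp"] in SUFFIX_BY_PTYP and value is not None:
            key = (row["fid"], row["ptyp"])
            reports[key] = max(value, reports.get(key, value))

    # Attach the reports to the target persons only; all other rows stay missing.
    result = []
    for row in rows:
        new_row = dict(row)
        is_target = row["ptyp"] == target_ptyp
        for ptyp, suffix in SUFFIX_BY_PTYP.items():
            report = reports.get((row["fid"], ptyp)) if is_target else None
            new_row[f"{var}_{suffix}"] = report
        result.append(new_row)
    return result


# Toy data (invented): family 1 has reports from ptyp 300 and 400, family 2 only from ptyp 300.
toy = [
    {"fid": 1, "ptyp": 1, "edu0100t": None},
    {"fid": 1, "ptyp": 300, "edu0100t": 3},
    {"fid": 1, "ptyp": 400, "edu0100t": 4},
    {"fid": 2, "ptyp": 1, "edu0100t": None},
    {"fid": 2, "ptyp": 300, "edu0100t": 2},
]
out = assign_proxy(toy, "edu0100t", target_ptyp=1)
print(out[0]["edu0100t_m"], out[0]["edu0100t_f"], out[0]["edu0100t_g"])
```

* In this toy example, twin 1 of family 1 receives both the _m and the _f value, which is the situation in which you have to choose which proxy report to use.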


*##########################################################################################################################.

*******************
*       SPSS      *
*******************

*--------------------------------
*SPSS syntax for assigning parental proxy information for variables to target persons:
*--------------------------------
*Notes:
*Missing values must be declared, otherwise the syntax will not work properly. Declared missing values are ignored.
*In addition, the Python extension for SPSS must be installed for the code to work.
*The dataset you want to edit must be the active dataset and must be in long format.
*Data should be sorted by family ID (lower numbers first).
*In the line targetvariable = "edu0100t", simply replace the name in quotation marks with the variable you are interested in (e.g. edu0100s or int0100t).
*The syntax automatically derives the correct person type from the last letter of the variable name (t, u or s).
*Examples:

*----.
*Assign the parent's proxy information for edu0100t to twin 1:

begin program python3. 
import spss, spssaux, re                 # import necessary packages from python
informants = [300,400,500,600]         # declare possible informant person types
targetvariable = "edu0100t"                # DECLARE TARGETVARIABLE, change the code only here, you don't need to change it in other places
patternt = re.compile(".......t$")          # define variable pattern for first born twins.
patternu = re.compile(".......u$")        # define variable pattern for second born twins.
patterns = re.compile(".......s$")       # define variable pattern for siblings
siblingtype = 0
if patternt.search(targetvariable):        # if the target variable is a first-born twin variable, set sibling type to 1 (first-born twin)
    siblingtype=1
elif patternu.search(targetvariable):         # if the target variable is a second-born twin variable, set sibling type to 2 (second-born twin)
    siblingtype=2
elif patterns.search(targetvariable):         # if the target variable is a sibling variable, set sibling type to 200 (sibling)
    siblingtype=200
if siblingtype in (1, 2, 200): # the code does nothing if the variable is not a proxy report
    for p in informants:  # loop through informant types 
        parenttype = p 
        if p == 300:       # set names for final variable according to informant type 
            targetname = targetvariable + "_m"   
        if p == 400:
            targetname = targetvariable + "_f" 
        if p == 500:
            targetname = targetvariable + "_g" 
        if p == 600:
            targetname = targetvariable + "_n" 
        # Restrict the dataset to the sibling type of the target variable and the parent type of the current loop iteration (sel if).
        # Split the dataset by family ID (Aggregate + break).
        # In each family, take the last non-missing value (there should be at most one) and store it in the dummy variable, which then has the same value
        # for each person considered in this operation (one sibling and one parent).
        # Delete the unnecessary duplicate value stored with the parent.
        # Rename the dummy variable to make it meaningful.
        # The syntax passed to spss.Submit contains %s placeholders; the values that fill them must be listed after the string in their order
        # of occurrence.
        spss.Submit("""
        temp. 
        sel if (ptyp = %s OR ptyp = %s). 
        Aggregate
        /break =fid                       
        /dummy = LAST(%s).       
        if (ptyp = %s) dummy = $SYSMIS.     
        Rename Variables dummy = "%s"."""  
        %(parenttype,siblingtype,targetvariable,parenttype,targetname)) 
end program.
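
*The name handling in the program above (last letter of the variable name to person type, informant type to suffix) can be tried out without SPSS.
*The following is only a minimal sketch in plain Python: the function name plan_targets and the two lookup tables are hypothetical, while the codes and suffixes are taken from the program above.

```python
import re

# Hypothetical lookup tables; the codes and suffixes mirror the program above.
PTYP_BY_LETTER = {"t": 1, "u": 2, "s": 200}            # last letter -> target person type
SUFFIX_BY_INFORMANT = {300: "_m", 400: "_f", 500: "_g", 600: "_n"}


def plan_targets(targetvariable):
    """Return (siblingtype, {informant ptyp: new variable name}) or None."""
    # Same idea as the three patterns above: seven arbitrary characters
    # followed by t, u or s at the end of the variable name.
    match = re.search(r".{7}([tus])$", targetvariable)
    if match is None:
        return None  # not a proxy-report variable, so nothing would be done
    siblingtype = PTYP_BY_LETTER[match.group(1)]
    names = {p: targetvariable + s for p, s in SUFFIX_BY_INFORMANT.items()}
    return siblingtype, names


print(plan_targets("edu0100u"))
print(plan_targets("fid"))
```

*A lookup table keeps the suffix for each informant type in one place, which is an alternative to the chain of if statements used above.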



*----
*Assign the parent's proxy information for edu0100u to twin 2:


begin program python3. 
import spss, spssaux, re                 # import necessary packages from python
informants = [300,400,500,600]         # declare possible informant person types
targetvariable = "edu0100u"                # DECLARE TARGETVARIABLE, change the code only here, you don't need to change it in other places
patternt = re.compile(".......t$")          # define variable pattern for first born twins.
patternu = re.compile(".......u$")        # define variable pattern for second born twins.
patterns = re.compile(".......s$")       # define variable pattern for siblings
siblingtype = 0
if patternt.search(targetvariable):        # if the target variable is a first-born twin variable, set sibling type to 1 (first-born twin)
    siblingtype=1
elif patternu.search(targetvariable):         # if the target variable is a second-born twin variable, set sibling type to 2 (second-born twin)
    siblingtype=2
elif patterns.search(targetvariable):         # if the target variable is a sibling variable, set sibling type to 200 (sibling)
    siblingtype=200
if siblingtype in (1, 2, 200): # the code does nothing if the variable is not a proxy report
    for p in informants:  # loop through informant types 
        parenttype = p 
        if p == 300:       # set names for final variable according to informant type 
            targetname = targetvariable + "_m"   
        if p == 400:
            targetname = targetvariable + "_f" 
        if p == 500:
            targetname = targetvariable + "_g" 
        if p == 600:
            targetname = targetvariable + "_n" 
        # Restrict the dataset to the sibling type of the target variable and the parent type of the current loop iteration (sel if).
        # Split the dataset by family ID (Aggregate + break).
        # In each family, take the last non-missing value (there should be at most one) and store it in the dummy variable, which then has the same value
        # for each person considered in this operation (one sibling and one parent).
        # Delete the unnecessary duplicate value stored with the parent.
        # Rename the dummy variable to make it meaningful.
        # The syntax passed to spss.Submit contains %s placeholders; the values that fill them must be listed after the string in their order
        # of occurrence.
        spss.Submit("""
        temp. 
        sel if (ptyp = %s OR ptyp = %s). 
        Aggregate
        /break =fid                       
        /dummy = LAST(%s).       
        if (ptyp = %s) dummy = $SYSMIS.     
        Rename Variables dummy = "%s"."""  
        %(parenttype,siblingtype,targetvariable,parenttype,targetname)) 
end program.

*----
*Assign the parent's proxy information for edu0100s to sibling:


begin program python3. 
import spss, spssaux, re                 # import necessary packages from python
informants = [300,400,500,600]         # declare possible informant person types
targetvariable = "edu0100s"                # DECLARE TARGETVARIABLE, change the code only here, you don't need to change it in other places
patternt = re.compile(".......t$")          # define variable pattern for first born twins.
patternu = re.compile(".......u$")        # define variable pattern for second born twins.
patterns = re.compile(".......s$")       # define variable pattern for siblings
siblingtype = 0
if patternt.search(targetvariable):        # if the target variable is a first-born twin variable, set sibling type to 1 (first-born twin)
    siblingtype=1
elif patternu.search(targetvariable):         # if the target variable is a second-born twin variable, set sibling type to 2 (second-born twin)
    siblingtype=2
elif patterns.search(targetvariable):         # if the target variable is a sibling variable, set sibling type to 200 (sibling)
    siblingtype=200
if siblingtype in (1, 2, 200): # the code does nothing if the variable is not a proxy report
    for p in informants:  # loop through informant types 
        parenttype = p 
        if p == 300:       # set names for final variable according to informant type 
            targetname = targetvariable + "_m"   
        if p == 400:
            targetname = targetvariable + "_f" 
        if p == 500:
            targetname = targetvariable + "_g" 
        if p == 600:
            targetname = targetvariable + "_n" 
        # Restrict the dataset to the sibling type of the target variable and the parent type of the current loop iteration (sel if).
        # Split the dataset by family ID (Aggregate + break).
        # In each family, take the last non-missing value (there should be at most one) and store it in the dummy variable, which then has the same value
        # for each person considered in this operation (one sibling and one parent).
        # Delete the unnecessary duplicate value stored with the parent.
        # Rename the dummy variable to make it meaningful.
        # The syntax passed to spss.Submit contains %s placeholders; the values that fill them must be listed after the string in their order
        # of occurrence.
        spss.Submit("""
        temp. 
        sel if (ptyp = %s OR ptyp = %s). 
        Aggregate
        /break =fid                       
        /dummy = LAST(%s).       
        if (ptyp = %s) dummy = $SYSMIS.     
        Rename Variables dummy = "%s"."""  
        %(parenttype,siblingtype,targetvariable,parenttype,targetname)) 
end program.