Data editing and management: The checks on the derived variables

The checks on the derived variables

Once the derived variables had been created, a variety of checks were performed to ensure that they had been calculated correctly.

  • Checking activity and sports measures

    The main activity-related derived variables are created for multiple composite activities (as described on the previous page). 

    Initially, checks were carried out at the activity level. Firstly, the hard logic of the syntax used to derive each measure was checked against the specification.

    Next, selected activities were tested by cross-tabulating the raw (source) variables against the derived variables to confirm that the data matched as it should.
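    A cross-tabulation of this kind can be sketched as follows. This is only an illustration in Python rather than the SPSS syntax actually used, and the variable names (`raw_did_activity`, `derived_participated`), the values and the mapping rule are all hypothetical.

```python
from collections import Counter

# Hypothetical records: a raw survey answer and the derived flag built from it.
# Names and values are illustrative only, not taken from the survey data.
records = [
    {"raw_did_activity": "yes", "derived_participated": 1},
    {"raw_did_activity": "yes", "derived_participated": 1},
    {"raw_did_activity": "no", "derived_participated": 0},
    {"raw_did_activity": "no", "derived_participated": 1},  # deliberate mismatch
]


def crosstab(rows, raw_key, derived_key):
    """Count how often each (raw value, derived value) pair occurs."""
    return Counter((row[raw_key], row[derived_key]) for row in rows)


table = crosstab(records, "raw_did_activity", "derived_participated")

# Assumed rule: "yes" should map to 1 and "no" to 0; any other cell is suspect.
expected_cells = {("yes", 1), ("no", 0)}
unexpected = {cell: n for cell, n in table.items() if cell not in expected_cells}

for (raw, derived), n in sorted(table.items()):
    print(f"raw={raw!r:5} derived={derived} count={n}")
print("unexpected cells:", unexpected)
```

    Any non-empty cell outside the expected combinations points to cases worth inspecting.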

    Extensive checks were then carried out to ensure that the derived variables matched the specification at every step of the derivation. This was done by creating temporary ‘checking variables’, using syntax written to match the specification.

    Each temporary variable was then compared with the actual derived variable. Any cases that did not match exactly showed where there might be a problem with the derivation. These cases were investigated and dealt with appropriately.
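    The ‘checking variable’ approach can be illustrated with a minimal sketch. The derivation rule below (weekly minutes as days multiplied by minutes per day) and all field names are assumptions made for the example, not the survey's actual specification.

```python
# Hypothetical rule, for illustration only: weekly minutes = days * minutes per
# day, and missing (None) when no days were reported.
def derive_check_minutes(days, minutes_per_day):
    if days is None or days == 0 or minutes_per_day is None:
        return None
    return days * minutes_per_day


# Hypothetical cases; "derived_minutes" stands in for the variable under test.
cases = [
    {"id": 1, "days": 2, "minutes_per_day": 30, "derived_minutes": 60},
    {"id": 2, "days": 0, "minutes_per_day": None, "derived_minutes": None},
    {"id": 3, "days": 3, "minutes_per_day": 20, "derived_minutes": 40},  # deliberate error
]


def find_mismatches(rows):
    """Recompute the checking variable and list cases where it differs."""
    mismatches = []
    for row in rows:
        check = derive_check_minutes(row["days"], row["minutes_per_day"])
        if check != row["derived_minutes"]:
            mismatches.append((row["id"], check, row["derived_minutes"]))
    return mismatches


mismatches = find_mismatches(cases)
print(mismatches)  # each entry: (case id, checking value, derived value)
```

    The list of mismatching case identifiers is what would then be investigated.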

    This was done in steps to ensure that the relationship between raw and derived variables was built up correctly. Checks were used to ensure that:

    • the days on which an activity was done were only recorded where the child reported doing that activity in the last week
    • minutes were only recorded for an activity reported on at least one day in the last week
    • the derived intensity variables for each activity corresponded to the answers given to the survey questions
    • the minutes of activity in each setting (during or outside normal school hours, indoors or outdoors) related to the answers given to the questions
    • the minutes of activity in derived variables corresponded to the time reported or the correct figure for imputation for that activity and year group.
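    The first two checks in the list above could be expressed along the following lines. This is a sketch under assumed field names (`did_activity`, `days`, `minutes`), which are hypothetical and do not come from the survey data.

```python
def check_case(case):
    """Return a list of rule violations for one hypothetical activity record."""
    problems = []
    did_activity = case["did_activity"]  # True if reported in the last week
    days = case["days"]                  # list of days reported, empty if none
    minutes = case["minutes"]            # None if no minutes were recorded

    # Days should only be recorded where the activity was reported.
    if days and not did_activity:
        problems.append("days recorded without activity in last week")

    # Minutes should only be recorded where at least one day was reported.
    if minutes is not None and not days:
        problems.append("minutes recorded without any day")

    return problems


# Hypothetical cases, two of which deliberately break a rule.
cases = [
    {"id": 1, "did_activity": True, "days": ["Mon", "Wed"], "minutes": 45},
    {"id": 2, "did_activity": False, "days": [], "minutes": None},
    {"id": 3, "did_activity": False, "days": ["Tue"], "minutes": 30},
    {"id": 4, "did_activity": True, "days": [], "minutes": 20},
]

failures = {c["id"]: check_case(c) for c in cases if check_case(c)}
print(failures)
```

    Keeping each rule as a separate test makes it possible to see which step of the derivation a failing case relates to.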

    In addition, checks were conducted to ensure that other answers were feeding into participation measures correctly.

    The composite sports variables were only created once it was confirmed that the individual activity variables had been derived correctly.

    Checks were also carried out to ensure that the correct activities fed into each composite, which would then be used for multiple participation variables.

    Primarily, the SPSS syntax was checked against the specification (which was itself checked and signed off by the Sport England team) to ensure that composite variables were defined correctly for the key activity measures.

    Comparisons were made between different participation measures to check that the way in which they related was consistent with how they had been defined.

    Where problems were found, the syntax was corrected, the variables recreated and the checks repeated to ensure that the final data were correct.

    In a small number of instances, inconsistencies were found in the data for individual cases. These were investigated and found to relate to likely backtracking in the questionnaire, which led to small inconsistencies.

    No cases were deleted as a result of these checks, as the resulting data were not significantly affected.

  • Checking demographic variables

    Demographic variables were checked primarily by cross tabulation of the raw variables against the derived variables.

    A sense check was applied to variables to ensure that the frequencies ‘looked’ right.

    Finally, the demographic variables were checked against each other to ensure that they were internally consistent.

    This included checking that age bands tallied across variables and that derived variables which used the same source data contained the same number of valid responses.
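    An internal consistency check of this kind might look like the sketch below. The two age-band variables, their categories and the nesting between them are hypothetical examples, not the survey's actual bandings.

```python
# Hypothetical derived variables built from the same source (age); None = missing.
rows = [
    {"age_band_detailed": "5-7", "age_band_broad": "5-11"},
    {"age_band_detailed": "8-11", "age_band_broad": "5-11"},
    {"age_band_detailed": None, "age_band_broad": None},
    {"age_band_detailed": "12-16", "age_band_broad": "12-16"},
    {"age_band_detailed": "8-11", "age_band_broad": "12-16"},  # deliberate inconsistency
]

# Assumed nesting: which broad band each detailed band should fall into.
nesting = {"5-7": "5-11", "8-11": "5-11", "12-16": "12-16"}


def valid_count(data, key):
    """Number of non-missing values for one variable."""
    return sum(row[key] is not None for row in data)


# Variables sharing a source should have the same number of valid responses.
same_valid_n = valid_count(rows, "age_band_detailed") == valid_count(rows, "age_band_broad")

# Bands should tally: each detailed band must sit inside the matching broad band.
inconsistent = [
    i
    for i, row in enumerate(rows)
    if row["age_band_detailed"] is not None
    and nesting[row["age_band_detailed"]] != row["age_band_broad"]
]

print("same number of valid responses:", same_valid_n)
print("rows where bands do not tally:", inconsistent)
```

    Note that equal counts of valid responses do not by themselves show that the bands tally, which is why the example tests both.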

    Checks were also made by comparing the pattern of activity by demographic group across all the survey years.
