Carrying out secondary data analysis poses various ethical and methodological dilemmas for researchers including issues with informed consent, so why do it? 

Data archives now exist, and researchers need to be aware of the changes that must be made to the design, methodology and implementation of research in order to respond and adapt efficiently to this new reality. Data archives are an innovative way of disseminating data and providing material that can be effectively used for research and teaching (Bishop, 2009; Bishop, 2012) by utilising information technologies (Parry and Mauthner, 2004).

Research has identified specific benefits of carrying out secondary data analysis. One of these relates to the expectation that research data be transparent and reproducible: open data can increase the evaluation and reproduction of research (Roche et al., 2014) as well as provide a platform for auditing the quality and reliability of research data and findings.

Research is often publicly funded and, therefore, secondary data analyses can be a way of adding value to the original investment, increasing the cost-efficiency of funded research (Camfield and Palmer-Jones, 2013; Roche et al., 2014). Another benefit of secondary analyses is that they prolong the use of data over time; participants will often engage in research not only to contribute to a particular project, but also to the broader body of knowledge (Bishop, 2013). Researchers have a duty to benefit society through their work by improving policy and practice; secondary data analysis can be another means to achieve this goal with reduced risk to participants (Bishop, 2009).

This secondary data analysis was carried out with young people involved in the Big Brother Big Sister mentoring programme, who have been described as a ‘vulnerable’ population that may experience poor social skills, low self-esteem and/or economic disadvantage (Dolan, Brady, O’Regan, Brumovska, Canavan and Forkan, 2010). This posed further challenges to the justification of carrying out such an analysis, in addition to the existing criticisms of re-using data.

The original research study evaluated the benefits of mentoring relationships between an adult and a young person by focusing on the role of social supports, emotional well-being, education, risk behaviour, relationships and outcomes of matching a young person and an adult (Dolan, Brady, O’Regan, Brumovska, Canavan and Forkan, 2010).

This secondary analysis focused on exploring one aspect of mentoring relationships that had not been explored previously: the role of empathy. It was, therefore, considered that the context of both research questions was similar enough for the data to be valuable and useful in extending what the primary analysis had already established about mentoring relationships and their impact on young people. The secondary analysis would build on the first analysis to provide further evidence of how to improve the mentoring programme, maximise the benefits for young people and expand the body of knowledge to inform policy and improve practice in the field.

Consent and informed consent

The first major issue that this secondary data analysis faced was that participants were not asked to consent for their data to be archived or included in further research. This was a limitation that had to be carefully dealt with. Original participants gave consent to a research study that in essence provided further understanding of mentoring relationships, supports, benefits and outcomes. The secondary data analysis expands on this knowledge by introducing a new variable: empathy. It was considered that the purpose of the primary analysis also covered the purpose of the secondary analysis, which was therefore deemed viable. According to Bishop (2013), some secondary researchers have argued that they are entitled to use data in new ways as long as the confidentiality and integrity of the data are not breached at any time. Issues with consent were addressed by choosing secondary data analysis rather than archiving the data. Secondary data analysis allowed the original researchers to be involved and to ‘supervise’ the type of analysis that was carried out with the data.

Secondary researchers are aware that ideally consent should have been sought from participants at the time the data was collected. However, the next issue is determining if consent at the time of primary data collection is really ‘informed’ as researchers may not foresee specifically what the archived data or secondary data analyses in the future will be before the research is designed (Bishop, 2012). Bishop (2014) argues that researchers need to inform participants of the benefits and risks of taking part in research, but this does not mean that researchers are in a position to decide ‘what is best’ for participants. Therefore, consent should be sought and explanations of potential uses of the data should also be specified to help participants make their own informed, if limited, decisions.

Another issue of informed consent may be less explored in the literature: researcher consent. Consent from researchers, as well as participants, needs to be sought because where data is constructed mutually, both parties contributed to the production of data and therefore share ownership. More importantly, Camfield and Palmer-Jones (2013) would also argue that researchers may reveal and report personal information that may have contributed to build rapport. This information may also be sensitive; therefore, informed consent needs to be sought from participants and original data collectors and analysts, particularly in the case of secondary data analysis, as archived data is usually covered by data licenses agreed with the repository.

The ‘voice’ of young people

One important aspect of carrying out primary and/or secondary research is considering what impact the research findings will have on participants. Researchers have a specific interest in a field, which is important in triggering their motivation and commitment to carry out the research in the first place; however, research needs to take an ethical and responsible approach to safeguard the well-being of participants, in this case young people, both during and after the research. Secondary data analysis can be used to develop insight into hard-to-reach or vulnerable populations while reducing the level of potential participant distress (Irwin, 2013). Participants should be exempt from unnecessary intrusion: if primary data that can answer a research question already exist, collecting more data can be an intrusion (Bishop, 2009).

Exploratory secondary data analyses can provide insights into the perspectives and views of young people. Secondary researchers may be able to answer research questions through secondary data analysis alone, which means further primary research may not be required. This saves time and funding while making an effective contribution to the body of knowledge. If the secondary analysis is not sufficient to fully answer a specific research question, it can still inform the future research that has to be carried out, making that research more targeted and effective so that the question can be fully answered. This would be an ethical approach to research, as participants do not need to take part in unnecessary research processes. In terms of investment, funding can be targeted at specific needs and gaps in the research that existing databases cannot address.

Young people in this secondary data analysis were not explicitly asked about the topic of interest. One of the concerns of the secondary researchers was that this could lead to a limited understanding of the topic: because it was not directly approached, it would have to be inferred.

Secondary data analysis requires a level of interpretation of the data which may or may not achieve the depth and accurate representation of what a young person might have said if asked about the topic directly. Secondary data analysis, as much as (or more so than) primary data analysis, needs to have a rigorous, transparent and replicable analytical process to support the findings. Research findings need to be clearly supported by evidence identified in the primary data to ensure that the voices of young people are accurately captured and understood, avoiding bias from the researchers’ interpretation.

One of the ethical considerations in this secondary data analysis was to include young people in the dissemination of the findings. Access to the original cohort of young people was not possible and, in any case, they would have outgrown this developmental stage. It was decided to include a youth advisory group consisting of young people currently involved in mentoring relationships to inform the dissemination of the findings and be the ‘voice’ of young people on issues that matter to them, from their own perspective and approved by them. This would also provide an opportunity to audit and validate the findings, ensure that the current needs of young people are captured to inform policy, and ensure that the services originally targeted continue to be improved by the data that were commissioned for their evaluation. The findings will be adapted to the current service users to ensure that they are relevant and used for the ‘public good’ (Bishop, 2009).

Losing the context

One of the biggest concerns regarding secondary data analysis is the importance of context and rapport in the generation of qualitative research. Qualitative data can be defined as a mutual construction between researchers and participants, which is not possible in secondary data analysis (Irwin, 2013; Parry and Mauthner, 2004). Bishop (2012) argued that secondary data needs to include extensive and detailed descriptions of the context of the primary data. However, this does not mean that the original context can be or should be reproduced in secondary data analyses. Given the current tendency of research towards archiving, researchers need to start systematically and accurately documenting, in detail, the context of their primary data collection.

Another issue related to methodological approaches and context is the appropriateness of specific methodologies for secondary data analysis. According to Irwin (2013), only certain methodologies are appropriate for carrying out secondary data analysis. In ethnography, for example, the researcher is involved in the setting to such an extent that data becomes a product and possession of the researcher, limiting the possibilities of analysis by external researchers. Semi-structured interviews can produce data that is more independent of the primary researcher, suggesting that the assumptions and data generation are more evident and transparent (Irwin, 2013). In this secondary data analysis, primary data was obtained through semi-structured interviews and was, therefore, considered suitable for secondary data analysis.

Different views have emerged describing the removal of distance and emotional detachment from the data which can have benefits for the analysis process (Camfield and Palmer-Jones, 2013). Secondary data analysis may also benefit from wider contextual data, more resources, and more complex theoretical and methodological approaches (Camfield and Palmer-Jones, 2013). Overall, researchers may develop a better understanding of their topic and acquire new methodological and analytical skills over time. This can contribute to improving their own research and provide a depth of understanding that was not possible at the time of the original data collection and analysis.

In 2007 Moore introduced the concept of ‘recontextualisation’, which emphasises that all primary and secondary researchers engage in contextualisation. All researchers, whether or not they were involved in the original data context, need to support the claims they make from the data; competing claims exist independently of whether the data is primary or secondary (Camfield and Palmer-Jones, 2013; Bishop, 2014). According to Bishop (2014), context is built on the research question, suggesting that the original context may not be relevant at all once a new line of inquiry is introduced. Therefore, rather than ‘losing’ the context, secondary data analysis needs to build a new context that is suitable for the existing data and proposed methodology.

Lack of ‘familiarity’ with the data

Another consideration for secondary researchers was limited knowledge of the data at the time of seeking funding, which was a precondition for gaining access to the primary data. At the time of securing funding, researchers may or may not be familiar enough with the data, and this can be an issue as the proposed methodology may not be suitable once the data is obtained. This issue may become more relevant if, and when, researchers do not have access to archived data unless they have secured funding. One of the important criteria in applications for secondary data analysis is the innovation aspect of the proposed methodology. However, secondary researchers need to ensure that they can deliver the study as designed and that the ‘promised’ contribution to knowledge is achieved.

One way to mitigate the lack of familiarity with the context and the depth of the data is to facilitate and encourage communication between primary data collectors and data re-users (Roche et al., 2014). Dialogue with primary researchers can help secondary researchers access the original context where primary data was generated (Irwin, 2013) and acquire an understanding of the data available even before they have access to it.

In the case of this study, the main researcher had access to the original data collectors, data analyst and principal researchers, which provided an opportunity for deeper understanding and clarification of how the data was obtained, analysed and reported.


This paper has provided some insight into the ethical, technical and methodological challenges faced by researchers carrying out secondary data analysis with young people. Researchers are aware that this is not an extensive exploration of the issues, but rather an invitation for further reflection and analyses of the implications of carrying out this type of research using an example from practice.

Secondary data analysis with young people is a cost-efficient and effective way to contribute to the body of knowledge and to improve policy and practice, while reducing the level of intrusion and distress in vulnerable and hard-to-reach populations such as young people engaged in mentoring relationships. This paper provides some reflection on the informed decisions and considerations that were taken to safeguard the well-being and integrity of research participants when carrying out a secondary data analysis with young people.

Secondary researchers agree that ideally participants should have been asked for consent regarding future uses of their data at the time of the original data collection, particularly when considering archiving data and when contact and communication with primary investigators is limited or impossible.