Quo Vadis Raw Data?



Volume 33
Issue 12
Pages: 8–11

Regulatory definitions can be confusing at all levels. In this article, definitions of raw data are clarified and compared. Static and dynamic data are discussed, along with a detailed summary of the impact of these terms on raw data.

We revisit raw data following the publication by the UK’s regulatory agency of a new data integrity guidance, where the raw data definition contradicts that in the Good Laboratory Practice regulations. In addition, we discuss static and dynamic data, and examine the impact of these terms on raw data.

Raw data is a topic that I have previously discussed in this "Focus on Quality" column, and in my "Questions of Quality" column in LCGC Europe since 1996. Let me give you a brief history of raw data from my perspective. Then, we will discuss a recent problem with the definition of raw data following the publication by the Medicines and Healthcare products Regulatory Agency (MHRA) of their new GXP data integrity guidance. It raises the question: Where is the definition of raw data going?

In the Beginning

My first column on raw data, published in December 1996, argued that electronic records and not paper printouts from a computerized system were raw data (1). Unfortunately, the timing was just before the publication of 21 CFR 11 on electronic records and electronic signatures (2). I revisited the topic in 2000 and considered the impact of the new regulation, but reinforced the position that electronic records were the raw data rather than paper printouts (3). Twelve years later, the subject was discussed again (4) with the publication on the FDA's web site of their explanation of why some in the pharmaceutical industry misinterpret 21 CFR 11 regulations and state that paper printouts are their original records (5). On the web site, the FDA used clauses 211.68(b) and 211.180(d) of the Good Manufacturing Practice (GMP) regulations (6) to state that paper printouts were not representative of the electronic records, as printouts were not "exact and complete," nor "true copies," respectively. There was much more information in the electronic records.

In this column, we have also considered an interpretation of the GMP term complete data for laboratory records in 21 CFR 211.194(a) (7). The conclusion was that all records generated during any analysis must be included as complete data. There is a small clue to help your interpretation with the word complete. In August 2016, the FDA published the proposed update to the existing Good Laboratory Practice (8) as the GLP Quality System (9). This triggered my last discussion of raw data in this column (10), partly because of this new proposed regulation, but also because the European Union (EU) GMP Chapter 4 on documentation uses the term raw data, but fails to define it (11). Chapter 4 is currently being revised with Annex 11 to reinforce data integrity principles (12).

Original Raw Data Definition

Although used in GMP, raw data is a GLP term that was first defined in the 1978 GLP regulations (8) in 21 CFR 58.3(k) as:

“any laboratory worksheets, records, memoranda, notes, or exact copies thereof, that are the result of original observations and activities of a nonclinical laboratory study, and are necessary for the reconstruction and evaluation of the report of that study.” [emphasis added]

The key points from this definition are that raw data consisted of:

  • original observations

  • all data generated, derived, or interpreted from that point forward to the report of the work; by implication, it should also be possible to trace a result in the report back to the original observations.

In the same column, we were able to equate and harmonize raw data with complete data. Quod erat demonstrandum!

But a problem has arisen now with the publication of the MHRA GXP data integrity guidance (13).

What’s the Problem?

The final version of the MHRA’s GXP data integrity guidance was issued in March 2018 (13). This is a much better presented and more rounded document than the draft from 2016 (14) and the two versions of the GMP document published in 2015 (15,16).

In the new MHRA GXP guidance (13), raw data are defined in section 6.2 as:

“the original record (data) which can be described as the first-capture of information, whether recorded on paper or electronically.”

According to the MHRA, raw data only consists of the first-capture of information, and this equates to original observations in United States (US) GLP above (8). In the MHRA definition, there is no mention of:

“. . . and activities . . . necessary for the reconstruction and evaluation of the report . . .”

which leaves a divergence between the MHRA definition and the original GLP definition (8) presented above and discussed in my 2016 column (10).

To be fair to the MHRA, in the explanation section for raw data there is the following statement:

“Raw data must permit full reconstruction of the activities.”

There is the requirement for reconstruction of activities that comes closer to the GLP definition in 21 CFR 58 (8). However, the major issue, from my perspective, is that the explanation is not incorporated in the definition itself. Readers tend only to focus on the definition and rarely read or remember the explanation.


OECD GLP Regulations and Raw Data

The MHRA is responsible for enforcing the GXP regulations used by regulated firms within the UK. The applicable regulations for laboratories operating to Good Manufacturing Practice are EU GMP Part 1 or Part 2, which apply to regulated laboratories testing finished products, intermediates, and raw materials, or active pharmaceutical products, respectively. The issue of a lack of raw data definition in EU GMP Chapter 4 on documentation was discussed in the 2016 column (10). For laboratories operating under Good Laboratory Practice, the applicable regulations enforced by the MHRA’s Good Laboratory Practice Monitoring Authority (GLPMA) are the Organisation for Economic Co-operation and Development (OECD) Principles of GLP (OECD GLP) and associated advisory documents.

OECD Principles of Good Laboratory Practice (17) define raw data in section 2.2.7 as:

“all original test facility records and documentation, or verified copies thereof, which are the result of the original observations and activities in a study.”

There is also a paragraph expanding the type of records that could be generated and are capable of being retained for the record retention period for nonclinical studies. You will notice the italic text in the definition of raw data: original observations and activities in a study. The definition does not go quite as far as US GLP in that the requirement for “reconstruction and evaluation of the study” (8) is missing or implied. However, it is wider in scope than the MHRA definition of raw data in their 2018 data integrity guidance (13).

Could the OECD regulations be wrong? After all, the regulations are 20 years old. Let us consider what is contained in the OECD series of Advisory Documents on Principles of Good Laboratory Practice and Compliance Monitoring. Advisory Document number 17 is entitled “Application of GLP Principles to Computerized Systems,” and was issued in 2016 (18). This document is essentially Annex 11 on steroids. In Appendix 2 of the glossary, there is the following definition for data (raw data):

“The GLP Principles define raw data as all laboratory records and documentation, including data directly entered into a computer through an automatic instrument interface, which are the results of primary observations and activities in a study and which are necessary for the reconstruction and evaluation of the report of that study.”

We have from this definition that raw data consist of:

  • all laboratory records and documentation

  • data directly entered into a computer through an automatic instrument interface

  • results of primary observations and activities in a study

  • [data] necessary for the reconstruction and evaluation of the report.

Notice the similarity with the US GLP definition of raw data (8)? Apart from the use of primary observations rather than original observations, it is virtually the same and has a similar meaning.

This now raises a question. The MHRA definition of raw data in their new guidance document does not appear to be tenable, as it directly contradicts the OECD regulations (17) and Advisory Document 17 (18) that the MHRA themselves must enforce. If the MHRA cannot interpret the regulations that they must enforce, what hope is there for regulated laboratories? Where do we go from here?

MHRA to the Rescue?

You may think I have taken leave of my senses having spent time until now critiquing the MHRA definition of raw data. Help is at hand and it comes, rather surprisingly, in MHRA’s own GXP data integrity guidance document (13). Fast forward through the document and arrive at section 6.11, which discusses original record and true copy.

Definition of Original Record?

The definition of original record is:

“The first or source capture of data or information, e.g. original paper record of manual observation or electronic raw data file from a computerized system, and all subsequent data required to fully re-construct the conduct of the GXP activity” (13).

Let us look at this definition in some more detail and see the component parts in comparison with the US GLP definition:

  • first or source capture of data or information: this equates to original observations or primary observations

  • covers both paper and computerised systems: this is implicit in the original GLP definition

  • all subsequent data: this is the same as activities

  • to fully reconstruct the conduct of the GXP activity: equates to reconstruct and evaluate the study.

Ignore the heading; this is a great definition of raw data. It is a pity that the definition is called original record. In fact, if you swapped the titles of the two MHRA definitions (raw data becomes original record and vice versa), all would be great.


Putting Raw Data in Context

Although this column has discussed raw data extensively, it is important not to forget that raw data includes all the associated metadata. In addition, the records must be complete, consistent, and accurate throughout the data life cycle. You will recall that a flexible analytical data life cycle was presented and discussed in a recent Focus on Quality column (10).

Metadata is defined by the MHRA guidance section 6.3 (13) as:

“data that describe the attributes of other data and provide context and meaning. Typically, these are data that describe the structure, data elements, inter-relationships and other characteristics of data, e.g. audit trails. Metadata also permit data to be attributable to an individual (or if automatically generated, to the original data source).”

In the explanation section, it notes that (13):

  • metadata form an integral part of the original record, and

  • without the context provided by metadata, the data have no meaning.

Therefore, when considering raw data (or, in the MHRA's terms, the original record), all metadata, including all pertinent audit trail entries, are essential for the transparent reconstruction of any GXP activity, from initial acquisition to the report or record, as well as from the report or record back to initial acquisition. This is a different way of considering complete data: all records and supporting contextual metadata from sampling to reporting. In addition, elements of data governance such as an open culture, data integrity policies and procedures, training, and roles and responsibilities such as data ownership have an impact on raw data, and these are discussed in more detail in a new book on data integrity and data governance for regulated laboratories (19).
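To make the idea of two-way reconstruction concrete, here is a minimal sketch in Python. It is an illustration only, not something taken from any guidance document or laboratory system: the Record class, its field names, the record identifiers, and the choice of "user" and "timestamp" as required metadata are all hypothetical assumptions. It shows a result being traced back to the first capture, the first capture being traced forward to the report, and a simple check for records that lack contextual metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Record:
    """One record in the data life cycle (all field names are hypothetical)."""
    record_id: str
    description: str
    derived_from: Optional[str] = None  # parent record; None means first capture
    metadata: Dict[str, object] = field(default_factory=dict)


def trace_back(records: Dict[str, Record], record_id: str) -> List[str]:
    """Follow derived_from links from a record back to the first capture."""
    chain: List[str] = []
    current = records.get(record_id)
    while current is not None:
        chain.append(current.record_id)
        parent_id = current.derived_from
        current = records.get(parent_id) if parent_id is not None else None
    return chain


def trace_forward(records: Dict[str, Record], record_id: str) -> List[str]:
    """List a record and everything derived from it, breadth first."""
    chain = [record_id]
    frontier = [record_id]
    while frontier:
        parent_id = frontier.pop(0)
        for rec in records.values():
            if rec.derived_from == parent_id:
                chain.append(rec.record_id)
                frontier.append(rec.record_id)
    return chain


def missing_metadata(records: Dict[str, Record],
                     required: Sequence[str]) -> List[str]:
    """Return ids of records that lack any of the required metadata keys."""
    return [r.record_id for r in records.values()
            if any(key not in r.metadata for key in required)]


# Hypothetical three-step chain: acquisition, integration, reported result.
example = [
    Record("acq-001", "chromatogram data file", None,
           {"user": "analyst1", "timestamp": "2018-03-01T09:00:00"}),
    Record("int-001", "integrated peaks", "acq-001",
           {"user": "analyst1", "timestamp": "2018-03-01T09:30:00"}),
    Record("rep-001", "reported result", "int-001",
           {"user": "analyst1"}),  # timestamp deliberately omitted
]
records = {r.record_id: r for r in example}

print(trace_back(records, "rep-001"))     # report back to first capture
print(trace_forward(records, "acq-001"))  # first capture forward to report
print(missing_metadata(records, ("user", "timestamp")))
```

In this sketch the reported result can only be reconstructed because every intermediate record keeps a link to its parent, and the last check flags the one record whose metadata would not make it attributable in time.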

Revision of EU GMP Chapter 4

As I mentioned earlier in this column, EU GMP chapter 4 uses the term raw data, but does not define it (11). Currently, the European Medicines Agency (EMA) has a project to revise Annex 11 and Chapter 4 to incorporate more emphasis on data integrity (12). The revised chapter must have a definition of raw data and an option could be to adapt the MHRA’s original record definition, such as:

“The first or source capture of data or information and all subsequent activities including metadata required to fully reconstruct the conduct of a GXP activity.”

One of the problems with writing the existing EU GMP Chapter 4 was the difficulty in agreeing on a definition that would fit with manufacturing, quality control, and quality assurance activities. The suggested definition above could be a starting point for discussions to agree on a suitable definition.

Static versus Dynamic Data

The terms static data, dynamic data, and record format appear in the MHRA, World Health Organization (WHO), FDA, and Pharmaceutical Inspection Cooperation Scheme (PIC/S) (14-16, 20-22) data integrity guidance documents, as well as the recent MHRA GXP guidance (13). As there are statements about static and dynamic data concerning raw data and original record in the MHRA GXP guidance, it is pertinent to discuss the terms here and how they impact raw data.

Section 6.11 of the MHRA guidance states that “original records can be static or dynamic” (13).

From the discussion above, we can equate original records to raw data. The difference between static and dynamic data now needs to be discussed. The FDA’s data integrity guidance (22) poses the question “How does FDA use the terms static and dynamic as they relate to record formats?” and answers it as follows:

“For the purposes of this guidance, ‘static’ is used to indicate a fixed-data document such as a paper record or an electronic image, and ‘dynamic’ means that the record format allows interaction between the user and the record content.”

Typical examples of static data could be a pH measurement, the printout of a weight from an analytical balance, a photograph, or a temperature of an environmental chamber. Although the value could be averaged in the case of a set of temperature measurements or involved in further calculations with a weight, the record itself cannot be changed. In contrast, dynamic data allows a user interaction such as interpretation of a spectrometer spectrum or chromatogram. Dynamic records carry a higher regulatory risk, due to the possibility of interpretation, manipulation, or falsification into compliance.

Section 6.2 of the MHRA guidance, on raw data, has two statements on dynamic and static data (13):

“Information that is originally captured in a dynamic state should remain available in that state (Definition). Raw data must permit full reconstruction of the activities. Where this has been captured in a dynamic state and generated electronically, paper copies cannot be considered as ‘raw data’ (Explanation).”

Therefore, if data are captured in a dynamic state, they must remain in that state, with obvious implications for record retention. Printing to paper is not an option. This is a reinforcement of the FDA’s position from 2010 published on their web site (5).
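The retention rule can be expressed as a very small check. The Python sketch below is an illustration under stated assumptions: the RecordFormat enum and the function name are hypothetical, and the treatment of statically captured data (any retained form is accepted) is a simplification that ignores the separate requirements for true or verified copies.

```python
from enum import Enum


class RecordFormat(Enum):
    STATIC = "static"    # fixed-data document, e.g. paper record or electronic image
    DYNAMIC = "dynamic"  # format allows interaction between user and record content


def retained_form_is_raw_data(captured_as: RecordFormat,
                              retained_as: RecordFormat) -> bool:
    """Hypothetical check: data captured dynamically must stay dynamic."""
    if captured_as is RecordFormat.DYNAMIC:
        return retained_as is RecordFormat.DYNAMIC
    # Simplification: statically captured data are accepted in any retained form.
    return True


# A chromatogram captured electronically but kept only as a paper printout.
print(retained_form_is_raw_data(RecordFormat.DYNAMIC, RecordFormat.STATIC))

# A balance printout captured on paper and kept on paper.
print(retained_form_is_raw_data(RecordFormat.STATIC, RecordFormat.STATIC))
```

The first call returns False, reflecting the statement that paper copies of dynamically captured electronic data cannot be considered raw data; the second returns True.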

But, at some point in the future, some of your laboratory records and data might need to be converted to a static format, due to a lack of suitable hardware or software to read the records. That is another discussion for another column.



The 2018 MHRA GXP data integrity guidance has a definition of raw data that conflicts with those in the GLP regulations. However, the definition of original record from the guidance is an adequate substitute and could be used instead. The impact of static and dynamic data types on raw data has also been discussed.


I would like to thank Chris Burgess and Mark Newton for comments made during the writing of this column.


(1) R.D. McDowall, LC-GC International, 9(12), 790–793 (1996).

(2) 21 CFR 11 Electronic records; electronic signatures, final rule, in Title 21 (Food and Drug Administration: Washington, DC, 1997).

(3) R.D. McDowall, LC-GC International 13(9), 648–657 (2000).

(4) R.D. McDowall, LCGC Europe 25(2), 88-102 (2012).

(5) FDA Questions and Answers on Current Good Manufacturing Practices, Good Guidance Practices, Level 2 Guidance - Records and Reports (2010, accessed 27 May 2016). Available from: http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm124787.htm.

(6) 21 CFR 211 Current Good Manufacturing Practice for Finished Pharmaceutical Products. (Food and Drug Administration: Silver Spring, MD, 2008).

(7) R.D. McDowall, Spectroscopy 28(4), 18-25 (2013).

(8) 21 CFR 58 Good Laboratory Practice for Non-Clinical Laboratory Studies. (Food and Drug Administration: Washington, DC, 1978).

(9) 21 CFR Parts 16 and 58 Good Laboratory Practice for Nonclinical Laboratory Studies; Proposed Rule. Federal Register 81(164), 58342–58380 (2016).

(10) R.D. McDowall, Spectroscopy 31(11), 18-21 (2016).

(11) EudraLex - Volume 4 Good Manufacturing Practice (GMP) Guidelines, Chapter 4 Documentation. (EU Commission, Editor, Brussels, 2011).

(12) Work plan for the GMP/GDP Inspectors Working Group for 2018 (European Medicines Agency, London, 2017).

(13) MHRA GXP Data Integrity Guidance and Definitions. (Medicines and Healthcare products Regulatory Agency, London, 2018).

(14) MHRA GXP Data Integrity Definitions and Guidance for Industry, Draft version for consultation July 2016. (Medicines and Healthcare products Regulatory Agency, London, 2016).

(15) MHRA GMP Data Integrity Definitions and Guidance for Industry 2nd Edition. (Medicines and Healthcare Products Regulatory Agency: London, 2015).

(16) MHRA GMP Data Integrity Definitions and Guidance for Industry 1st Edition. (Medicines and Healthcare Products Regulatory Agency, London, 2015).

(17) OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring Number 1, OECD Principles on Good Laboratory Practice. (Organisation for Economic Co-operation and Development, Paris, 1998).

(18) OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring Number 17. Application of GLP Principles to Computerised Systems. (Organisation for Economic Co-operation and Development, Paris, 2016).

(19) R.D. McDowall, Data Integrity and Data Governance: Practical Implementation in Regulated Laboratories. (Royal Society of Chemistry, Cambridge, 2018).

(20) PIC/S PI-041 Draft Good Practices for Data Management and Integrity in Regulated GMP / GDP Environments. (Pharmaceutical Inspection Convention / Pharmaceutical Inspection Co-Operation Scheme, Geneva, 2016).

(21) WHO Technical Report Series No. 996 Annex 5 Guidance on Good Data and Records Management Practices. (World Health Organization, Geneva, 2016).

(22) FDA Draft Guidance for Industry Data Integrity and Compliance with cGMP. (US Food and Drug Administration, Silver Spring, MD, USA, 2016).

R.D. McDowall is the director of R.D. McDowall Limited and the editor of the “Questions of Quality” column for LCGC Europe, Spectroscopy’s sister magazine. Direct correspondence to: SpectroscopyEdit@UBM.com