Laboratory data, its Management and Analysis

Scientific Journal

Information and data are collected in all sorts of forms, all the time; the twenty-first century is all about gathering them. Collected data may become irrelevant, however, if it is not adequately processed. Unprocessed data represents time, money, and manpower spent on collection without yielding any of the conclusions that could have been drawn. Data that is collected, analyzed, and interpreted properly gives insight into the challenge at hand and the solutions that can be offered.

Laboratories are major sources of systematic information and data. Laboratory data are of two types. The first is internal data, which mainly concerns internal quality control and management. The second is the laboratory's output data, such as analysis results or reports.

Internal data

A vast amount of internal information and data is generated by the implementation of a robust quality management system (QMS), which enables the laboratory to assure its quality. The QMS covers every aspect of the laboratory that could influence, directly or indirectly, the integrity of the analytical results.

Standard operating procedures (SOPs) need to be documented and catalogued for every process: sample receipt; sample handling; analytical procedures; procedures for equipment, chemical, and labware handling and management; traceability procedures; and laboratory safety management.

Personnel management: The knowledge and skills of the laboratory's technical staff affect the quality of the results produced. The laboratory should therefore be confident that the personnel involved are capable of conducting the methods correctly.

Validation analysis results: Validation analysis demonstrates that the principle of the method is correct. Validation is achieved by determining the following parameters, which delineate the quality of the method:

• Limit of detection (LOD) and limit of quantification (LOQ) are the lowest concentrations that can be identified and measured, respectively.

• Accuracy describes the difference between the result found by the laboratory and the true value in a sample.

• Precision describes the variation in the results found by the laboratory in the same sample. Precision can be estimated at the same time and under the same conditions (repeatability) or at different times and under different conditions (intra-laboratory reproducibility).

• Linearity describes the upper limit up to which the concentration shows a linear relationship with the measured signal. This parameter is only relevant when a calibration curve is used.

• Selectivity describes the influence of other components on the measured signal.

• Sensitivity describes the quantitative relationship between the analyte and the measured signal.

• Robustness describes the effect of variation in the procedure on the measured signal.

• Stability describes the change in concentration of an analyte over time when stored under specific conditions, such as temperature and pressure.

Calculation of these parameters requires analytical and statistical knowledge and depends on the type of methodology used.
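Several of the parameters above can be illustrated with a short calculation. The following Python sketch is only one possible approach and rests on assumptions not stated in this article: sensitivity is taken as the slope of a least-squares calibration line, LOD and LOQ are estimated with the widely used convention of 3.3 and 10 times the standard deviation of blank signals divided by that slope, accuracy is expressed as percentage bias, and precision as relative standard deviation. All function names and all numbers are hypothetical.

```python
from statistics import mean, stdev


def fit_line(conc, signal):
    """Least-squares fit of signal = slope * conc + intercept."""
    mx, my = mean(conc), mean(signal)
    sxx = sum((x - mx) ** 2 for x in conc)
    sxy = sum((x - mx) * (y - my) for x, y in zip(conc, signal))
    slope = sxy / sxx
    return slope, my - slope * mx


def lod_loq(blank_signals, slope):
    """Assumed convention: LOD = 3.3 * s / slope, LOQ = 10 * s / slope."""
    s = stdev(blank_signals)
    return 3.3 * s / slope, 10 * s / slope


def bias_percent(results, true_value):
    """Accuracy as the percentage difference between mean result and true value."""
    return 100 * (mean(results) - true_value) / true_value


def rsd_percent(results):
    """Precision as the relative standard deviation of replicate results."""
    return 100 * stdev(results) / mean(results)


# Hypothetical calibration standards, blank readings and replicate results.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 4.1, 6.1, 8.1]
blanks = [0.09, 0.11, 0.10, 0.12, 0.08]
replicates = [9.8, 10.1, 10.0, 9.9, 10.2]

slope, intercept = fit_line(conc, signal)  # slope = sensitivity
lod, loq = lod_loq(blanks, slope)
print("sensitivity:", slope, "LOD:", lod, "LOQ:", loq)
print("bias %:", bias_percent(replicates, 10.0))
print("RSD %:", rsd_percent(replicates))
```

Other conventions exist (for example, estimates based on signal-to-noise ratio), so the factors used here should be replaced by whatever the laboratory's validated procedure prescribes.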

Equipment: The equipment affects the quality of the analytical results produced, so it is of prime importance to ensure that it works according to specification. For this purpose, the laboratory should focus on regular maintenance and performance checks. The equipment logbook holds all records of maintenance, problems, and results of tests performed on the equipment, together with the extent of its use and the names of the users.

Control of the results: Routine laboratory activities are affected by random variations introduced during the execution of a procedure. The laboratory should therefore constantly check the quality of the results it produces by implementing a first line of control procedure. The basic principle is to analyze a control standard within each batch of samples and use its result to judge the quality of the other results in the batch.

Shewhart charts, or control charts, are most commonly used to record these results, together with acceptance criteria and relevant statistical information. These charts are powerful tools for continuously checking the quality of the results, and they also reflect changes in the performance of the method over time. The data from control charts, however, only reveal problems at batch level and therefore cannot guarantee the absence of errors in individual samples. For this reason, it is advisable to conduct analyses in duplicate, either in the same batch (repeatability conditions) or in different batches (intra-laboratory reproducibility conditions). Analyzing samples in different batches is statistically the best approach, but not always the most efficient. The quality of the results should also be monitored using set criteria for the allowable percentage difference between duplicates for each sample type.
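The first line of control and the duplicate check can be sketched in a few lines of Python. This is an illustration under assumptions not taken from this article: the chart limits are set at the common choice of mean ± 2 standard deviations (warning) and mean ± 3 standard deviations (action) of earlier control results, and the function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def control_limits(history):
    """Centre line plus assumed warning (2s) and action (3s) limits."""
    centre, s = mean(history), stdev(history)
    return {
        "centre": centre,
        "warning": (centre - 2 * s, centre + 2 * s),
        "action": (centre - 3 * s, centre + 3 * s),
    }


def judge_batch(control_result, limits):
    """Judge a whole batch from the result of its control standard."""
    low_action, high_action = limits["action"]
    low_warning, high_warning = limits["warning"]
    if not low_action <= control_result <= high_action:
        return "reject"
    if not low_warning <= control_result <= high_warning:
        return "warning"
    return "accept"


def duplicate_difference_percent(a, b):
    """Difference between duplicates as a percentage of their mean."""
    return 100 * abs(a - b) / ((a + b) / 2)


# Hypothetical control-standard results from earlier batches.
history = [50.2, 49.8, 50.1, 49.9, 50.0, 50.3, 49.7, 50.0]
limits = control_limits(history)

for result in (50.1, 50.5, 50.9):
    print(result, judge_batch(result, limits))

# Duplicate check against a hypothetical allowable difference of 5 %.
print(duplicate_difference_percent(10.0, 10.4) <= 5.0)
```

The batch verdict covers only the control standard; the duplicate check is what addresses individual samples, which is why the two are used together.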

Participation in collaborative trials (proficiency testing) and the analysis of certified reference material (CRM) give valuable information about the accuracy of the laboratory's results. Results from collaborative trials are mostly evaluated by a z-score, which is calculated as the difference between the found value and the average, divided by the overall standard deviation, and which should lie between -2 and +2. The z-score should be used to identify the presence of systematic errors and in the determination of the uncertainty of measurement.
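The z-score calculation described above can be written out as follows. This is a minimal sketch that takes the average and the overall standard deviation directly from the participants' results; proficiency-testing schemes may instead use an assigned value or robust statistics, so the approach, the function names, and the numbers are assumptions for illustration only.

```python
from statistics import mean, stdev


def z_score(found, participant_results):
    """(found value - average) / overall standard deviation."""
    return (found - mean(participant_results)) / stdev(participant_results)


def is_satisfactory(z, limit=2.0):
    """A z-score between -limit and +limit is considered acceptable."""
    return abs(z) <= limit


# Hypothetical results reported by the participating laboratories.
participants = [9.6, 10.0, 10.4, 10.2, 9.8]

for found in (10.4, 11.0):
    z = z_score(found, participants)
    print(found, round(z, 2), is_satisfactory(z))
```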

Traceability: Traceability of all activities in the laboratory is necessary to guarantee the quality of each individual result. To prove correct performance of the laboratory activity, all vital information must be recorded and stored. A traceability record should include the following items:

• Condition of sample on arrival

• Information about the control samples analyzed

• All raw data produced

• Name of laboratory technician for each analysis conducted

• Identification of equipment used

• Signature for acceptance and date the report is sent to client

• Any communication from the client.

All critical factors should be traceable from the information on the worksheet, so that, if necessary, the laboratory can prove that the technician was authorized to conduct the determination, the equipment had the correct calibration status, the calculation of results was correct, the samples were connected to the first line of control, and the results were authorized by qualified personnel. It is essential that all of this information is available and stored correctly by the laboratory for a specified period. The storage time depends on legal and customer requirements but, in general, a storage period of 5 years is acceptable.
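As an illustration, the items listed above could be held in a simple record structure with a completeness check and a retention date. The Python sketch below is hypothetical: the field names, the added sample_id field, and the helper functions are assumptions, and the default of 5 years reflects only the general figure mentioned above, not any specific legal requirement.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import Optional


@dataclass
class TraceabilityRecord:
    sample_id: str                  # hypothetical identifier, not in the list above
    condition_on_arrival: str
    control_sample_info: str
    raw_data_location: str
    technician: str
    equipment_id: str
    accepted_by: str                # signature for acceptance
    report_sent_on: Optional[date]  # date the report is sent to the client
    client_communication: list = field(default_factory=list)

    def missing_fields(self):
        """Names of required fields that are still empty."""
        return [
            f.name
            for f in fields(self)
            if f.name != "client_communication" and not getattr(self, f.name)
        ]


def retention_end(report_sent_on, years=5):
    """Date until which the record should be kept (assumed 5-year default)."""
    try:
        return report_sent_on.replace(year=report_sent_on.year + years)
    except ValueError:
        # 29 February falls back to 28 February in a non-leap target year.
        return report_sent_on.replace(year=report_sent_on.year + years, day=28)


record = TraceabilityRecord(
    sample_id="S-001",
    condition_on_arrival="intact, chilled",
    control_sample_info="control standard within limits",
    raw_data_location="raw/S-001.csv",
    technician="",                  # deliberately left empty
    equipment_id="EQ-07",
    accepted_by="lab manager",
    report_sent_on=date(2021, 3, 1),
)
print(record.missing_fields())
print(retention_end(record.report_sent_on))
```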

Document control: The quality control system requires a large number of controlled official documents to describe all aspects of the process. These documents have to be updated regularly, and the organization must ensure that only the most recent version is available to and in use by personnel.

Output data or report data

The primary aim of any laboratory is to generate results or reports, and a huge quantity of data is generated in the process. The output data usually consist of the following parameters: sample description, source, date of collection, date of analysis and report, analysis method, result, and comment on the result. These output data can be classified into two categories:

Results for customers – Usually this is the data generated by commercial laboratories. The data generated are usually not structured.

Results for R & D – Usually generated by R & D centers and academic institutes. The data generated are usually more structured.

Laboratory data management

Laboratory data management is a challenging operation. Successful modern laboratory data management needs a robust management system, an electronic laboratory notebook, and a data repository for internal data storage.

Current lab management platforms consist of Laboratory Information Management Systems (LIMSs) and Electronic Laboratory Notebooks (ELNs). A third tool, the Internal Data Repository (IDR), is emerging to store the vast amount of data generated.

A LIMS mainly focuses on managing laboratory operations. It was originally designed to improve lab efficiency by managing workflows relating to samples and their associated data. Nowadays, a LIMS is used for workflow management, record keeping, and inventory management, and thereby helps standardize operations, tests, and procedures while providing control over the process.

ELNs are the digital successors of paper-based laboratory notebooks, and are sufficiently secure and reliable to serve as a reference in legal matters. Ideally, ELNs cover all events accompanying a sample analysis and give an overview of all experiments run. Most ELNs allow manual upload of some representation of the experimental data (e.g. PDF files, spreadsheets, text), but fall short of handling the storage of experimental data and metadata adequately.

A dedicated data repository is essential to keep vast amounts of laboratory data findable and accessible, with enough metadata to ensure reusability. A flexible permissions system is required to regulate user access, together with search and filter functions for sifting through datasets. It is essential to gather the raw data comprehensively, through an automated process, directly from the instruments. Calibration and other validation data should be filtered out during routine search operations, yet remain available when needed.
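The repository requirements described above (metadata, permissions, search and filter, and calibration data hidden from routine searches) can be sketched as a small in-memory model. This Python example is purely illustrative: the class names, the role-based permission scheme, and the "kind" labels are assumptions, and a real repository would sit on a database with proper authentication.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    dataset_id: str
    kind: str            # assumed labels: "sample", "calibration", "validation"
    metadata: dict       # e.g. instrument, method, date
    allowed_roles: set   # roles permitted to see this dataset


class Repository:
    def __init__(self):
        self._datasets = []

    def add(self, dataset):
        self._datasets.append(dataset)

    def search(self, role, include_qc=False, **filters):
        """Return datasets visible to `role` whose metadata match all filters.

        Calibration and validation data are skipped unless include_qc is True.
        """
        hits = []
        for ds in self._datasets:
            if role not in ds.allowed_roles:
                continue
            if ds.kind != "sample" and not include_qc:
                continue
            if all(ds.metadata.get(k) == v for k, v in filters.items()):
                hits.append(ds)
        return hits


repo = Repository()
repo.add(Dataset("D1", "sample", {"instrument": "HPLC-1"}, {"analyst", "manager"}))
repo.add(Dataset("D2", "calibration", {"instrument": "HPLC-1"}, {"analyst", "manager"}))
repo.add(Dataset("D3", "sample", {"instrument": "GC-2"}, {"manager"}))

routine = repo.search("analyst", instrument="HPLC-1")
with_qc = repo.search("analyst", include_qc=True, instrument="HPLC-1")
print([ds.dataset_id for ds in routine])
print([ds.dataset_id for ds in with_qc])
```

Hiding quality-control data by default, rather than deleting or archiving it elsewhere, keeps routine searches uncluttered while leaving the calibration history one flag away.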

A comprehensive LIMS platform can be created by integrating the ELN and the data repository, allowing seamless management of the whole laboratory system and the integration of automatically gathered measurement data with manually entered protocols.

Big data

Individually, a customer's data may have little significance, but collectively the data may reveal something unique. This is where the concept of big data comes into the picture.

The concept of big data has been around for years; most organizations now understand that if they capture all the data that flows into their businesses, they can apply data analytics and return valuable insights to the industry. As early as the 1950s, businesses were using basic analytics, by means of manual examination, to uncover insights and trends.

New big data analytics brings speed and efficiency to business. Whereas a few years ago a business would have gathered information and analyzed it conventionally to unearth insights for future decisions, today’s big data analytics can identify insights for immediate decisions. The ability to work faster and stay agile gives organizations a competitive edge they did not have before. Big data analytics helps organizations harness their data and use it to identify new opportunities. That, in turn, leads to apt, customized solutions, more efficient operations, and happier customers.

Big data analysis of laboratory data has helped in varied fields such as weather forecasting, crop management, management of diseases and epidemics in humans and animals, mycotoxin risk management, and precision feeding in animal nutrition.