2.2. Experimental Design (Dr. Frank Dieterle)

Frank Dieterle

Ph. D. Thesis

2. Theory – Fundamentals of the Multivariate Data Analysis

2.2. Experimental Design


Once the type and concentration range of the analytes of interest and of additional factors such as temperature or humidity (in general, the independent variables) have been defined, a plan has to be set up that determines the number and composition of the samples to be measured. In chemometrics, this plan is known as the experimental design. An experimental design tries to cover the space spanned by the independent variables optimally with as few samples as possible, in order to understand the effects of these variables and to model the relationships between the dependent and independent variables. Among the many existing types of experimental designs, several are specialized for optimization strategies, such as the Central Composite Designs, the Doehlert Design or the Box-Behnken Design [2],[3]; several are mixture designs, in which all components add up to 100%; and several, such as the D-optimal designs [4]-[6], are specialized for a constrained variable space. In this study, the concentrations of the different analytes should be varied independently, and the number of concentration levels and thus the number of samples should not be constrained, which renders most of these designs unsuitable. Therefore, full factorial designs are used, which combine all levels of all independent variables (all defined concentration levels of all analytes). As a result, the number n of samples increases rapidly with the number x of analytes and with the number l of concentration levels per analyte:

$$ n = l^{x} \tag{1} $$
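
The growth of a full factorial design can be illustrated with a minimal Python sketch. The function name `full_factorial` and the relative concentration levels below are hypothetical and chosen only for illustration; they are not taken from the data sets of this work. The sketch assumes that every analyte uses the same l levels, so that the sample count is l raised to the power x.

```python
from itertools import product


def full_factorial(levels_per_analyte):
    """Return every combination of the given levels (one sequence per analyte)."""
    return list(product(*levels_per_analyte))


# Hypothetical example: x = 2 analytes with l = 3 relative concentration levels each.
levels = [0.0, 0.5, 1.0]
design = full_factorial([levels, levels])

# Each entry is one sample composition, e.g. (0.0, 0.5).
print(len(design))  # 3 ** 2 = 9 samples

# Adding a third analyte with the same levels triples the number of samples.
print(len(full_factorial([levels, levels, levels])))  # 3 ** 3 = 27 samples
```

With 10 levels and 4 analytes the same construction would already require 10,000 samples, which is why the number of levels and analytes has to be chosen with the measurement effort in mind.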

In this work, full factorial designs with and without equidistant levels are used for the calibration data sets. Most validation data sets are also based on full factorial designs. In these cases, the meshes of the calibration and validation designs are interleaved such that the validation points lie at the maximum distance from the calibration points, which allows the validation data to give a realistic estimate of the network performance in a real-world situation [7].
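
One possible way to construct such interleaved meshes is sketched below. It is an assumption of this sketch, not a statement of the procedure used in this work, that the validation levels are placed at the midpoints between neighbouring calibration levels, which maximizes the distance to the nearest calibration level along each axis. The helper name `midpoints` and the concentration levels are hypothetical.

```python
from itertools import product


def midpoints(levels):
    """Return the levels halfway between neighbouring calibration levels."""
    ordered = sorted(levels)
    return [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]


# Hypothetical equidistant calibration levels (relative concentrations).
calibration_levels = [0.0, 0.25, 0.5, 0.75, 1.0]
validation_levels = midpoints(calibration_levels)  # assumed placement: midpoints

# Full factorial designs for two analytes on both meshes.
calibration = list(product(calibration_levels, repeat=2))
validation = list(product(validation_levels, repeat=2))

print(len(calibration), len(validation))  # 25 calibration and 16 validation samples
```

In this construction no validation sample coincides with a calibration sample, so the validation error reflects interpolation between calibration points rather than reproduction of them.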
