Batch yield and purity | The two columns in the data set are the percentage yield from a batch reactor and the purity of the feedstock. The feedstock is what we add to the reactor, and the yield is measured after the reaction is completed. The cause-and-effect direction is that the purity of the feedstock has a (potential) impact on the yield. | 241 | 2 | multivariate, regression, least-squares |
Bioreactor yields | The percentage yield from a bioreactor given the temperature, impeller speed, duration, and whether or not the reactor has baffles. | 14 | 5 | multivariate, categorical, regression |
Blender efficiency | The effect of 4 factors on blending efficiency. | 18 | 5 | multivariate, doe |
Brittleness index | A plastic product is produced in three parallel reactors (TK104, TK105, or TK107). For each row in the dataset, we have the same batch of raw material that was split, and fed to the 3 reactors. These values are the brittleness index for the product produced in the reactor. | 15 | 3 | multivariate, missing-data, paired |
Certificates of analysis | Four properties of an important powder raw material were transcribed from the supplier's certificates of analysis. | 122 | 5 | multivariate, monitoring |
Cheddar cheese | Concentrations of acetic acid, H2S, and lactic acid in 30 samples of mature cheddar cheese. A subjective taste value is also provided. | 30 | 4 | multivariate, regression |
Class grades | Grades from a Chemical Engineering course at McMaster University. | 99 | 6 | multivariate, missing-data, regression |
Distillation tower | Snapshot measurements on 27 variables from a distillation column; measured over 2.5 years. | 253 | 27 | multivariate, outliers, regression |
Film thickness | The thickness of a plastic film is measured in 4 positions after being cut. The positions of the measurements are top right, top left, bottom right and bottom left. | 160 | 4 | multivariate |
Flotation cell | Data from a zinc-lead flotation cell measured on 5 variables; recorded from the PLCs. | 2922 | 5 | multivariate, time-series |
Food consumption | The relative consumption of certain food items in European and Scandinavian countries. The numbers represent the percentage of the population consuming that food type. | 16 | 20 | multivariate, missing-data |
Food texture | Texture measurements of a pastry-type food. | 50 | 5 | multivariate |
Kamyr digester | Pulp quality is measured by the lignin content remaining in the pulp: the Kappa number. This data set is used to understand which variables in the process influence the Kappa number, and if it can be predicted accurately enough for an inferential sensor application. | 301 | 22 | multivariate, missing-data, time-series |
LDPE | Data from a low-density polyethylene production process. There are 14 process variables and 5 quality variables (last 5 columns). | 54 | 19 | multivariate, outliers, monitoring |
Oil company DOE | Experimental data; testing amount of 4 materials added (A, B, C, D) in order to achieve a certain volumetric heat capacity, y. | 19 | 5 | multivariate, categorical, doe, regression |
Peas | The taste of 27 pea varieties as measured by judges. After blanching, the peas were quick-frozen, packed into bags and stored for 3 months. | 60 | 17 | multivariate |
Raw material outcome | Six characterizing measurements for batches of plastic pellets; the outcome when using this material, either Poor or Adequate, is also provided. | 24 | 7 | multivariate, categorical, regression |
Raw material properties | Measured characteristics of several batches (lots) of plastic pellets. | 36 | 6 | multivariate, missing-data |
Room temperatures | Temperature measurements, in Kelvin, taken from 4 corners of a room. | 144 | 4 | multivariate |
Sawdust | Sawdust from birch, pine and spruce was blended in specific ratios. The corresponding NIR spectra are recorded. | 54 | 1204 | multivariate |
Silicon wafer thickness | Thickness of a single wafer, measured at 9 locations for 184 consecutive batches. A single wafer is removed from a tray of wafers (always at the same position for each batch of wafers) after the chemical vapour deposition process is complete. | 184 | 9 | multivariate, outliers |
Six-point board thickness | Thickness of 2x6 SPF boards from a saw mill. | 5000 | 6 | multivariate |
Solvents | Physical properties of various chemical solvents, such as melting point, boiling point, dipole moment, refractive index, density, solubility. | 103 | 9 | multivariate |
Systematic method | Data are from an open-ended question on a final exam. The values are the grades achieved for the answer to that question, broken down by whether the student used a systematic method, or not. No grades were given for using a systematic method; grades were awarded only on answering the question. | 44 | 2 | multivariate |
Tablet NIR spectral data | Spectra, measured in the transmittance mode, of 460 pharmaceutical tablets; readings are from 600 to 1898 nm in 2 nm increments. | 460 | 650 | multivariate |
Travel times | A driver uses an app to track GPS coordinates as he drives to work and back each day. The app collects the location and elevation data. Data for about 200 trips are summarized in this data set. | 205 | 13 | multivariate, missing-data |
Wine DOE | Data from a fractional factorial design for profiling a new wine. The last 5 columns are the taste values from a panel of judges. Higher values indicate a better overall taste. | 16 | 13 | multivariate, doe, regression |
Wood fibres | A sample of aspen tree fibre as characterized by a fibre quality analyzer (FQA). | 25165 | 6 | multivariate |
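
The column count listed for the Tablet NIR spectral data (650) is consistent with readings from 600 to 1898 nm in 2 nm increments. A minimal sketch of rebuilding that wavelength axis follows; it assumes the 650 columns are ordered by increasing wavelength, which the table does not state, and the variable names are illustrative only.

```python
# Rebuild the wavelength axis described for the Tablet NIR spectral data.
# Assumption: columns run from the lowest to the highest wavelength.
start_nm = 600   # first reading, from the table description
stop_nm = 1898   # last reading, from the table description
step_nm = 2      # increment, from the table description

# range() excludes its stop value, so extend it by one step.
wavelengths = list(range(start_nm, stop_nm + step_nm, step_nm))

# Number of readings per tablet; should match the column count in the table.
n_wavelengths = len(wavelengths)
print(n_wavelengths)
```

If the file were loaded as a 460-by-650 array, `wavelengths` could serve as the column labels under the same ordering assumption.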