| Name | Description | Rows | Columns | Tags |
|---|---|---|---|---|
| Batch yield and purity | The two columns in the data set are: the percentage yield from a batch reactor, and the purity of the feedstock. The feedstock is what we add to the reactor, and the yield is measured after the reaction is completed. The cause-and-effect direction is that the purity of the feedstock has a (potential) impact on the yield. | 241 | 2 | multivariate, regression, least-squares |
| Bioreactor yields | The percentage yield from a bioreactor given the temperature, impeller speed, duration, and whether or not the reactor has baffles. | 14 | 5 | multivariate, categorical, regression |
| Blender efficiency | The effect of 4 factors on blending efficiency. | 18 | 5 | multivariate, doe |
| Brittleness index | A plastic product is produced in three parallel reactors (TK104, TK105, or TK107). For each row in the data set, the same batch of raw material was split and fed to the 3 reactors. The values are the brittleness index of the product produced in each reactor. | 15 | 3 | multivariate, missing-data, paired |
| Certificates of analysis | Four properties of an important powder raw material were transcribed from the supplier's certificates of analysis. | 122 | 5 | multivariate, monitoring |
| Cheddar cheese | Concentrations of acetic acid, H2S, and lactic acid in 30 samples of mature cheddar cheese. A subjective taste value is also provided. | 30 | 4 | multivariate, regression |
| Class grades | Grades from a Chemical Engineering course at McMaster University. | 99 | 6 | multivariate, missing-data, regression |
| Distillation tower | Snapshot measurements on 27 variables from a distillation column, measured over 2.5 years. | 253 | 27 | multivariate, outliers, regression |
| Film thickness | The thickness of a plastic film is measured in 4 positions after being cut. The measurement positions are top right, top left, bottom right, and bottom left. | 160 | 4 | multivariate |
| Flotation cell | Data from a zinc-lead flotation cell measured on 5 variables; recorded from the PLCs. | 2922 | 5 | multivariate, time-series |
| Food consumption | The relative consumption of certain food items in European and Scandinavian countries. The numbers represent the percentage of the population consuming that food type. | 16 | 20 | multivariate, missing-data |
| Food texture | Texture measurements of a pastry-type food. | 50 | 5 | multivariate |
| Kamyr digester | Pulp quality is measured by the lignin content remaining in the pulp: the Kappa number. This data set is used to understand which variables in the process influence the Kappa number, and whether it can be predicted accurately enough for an inferential sensor application. | 301 | 22 | multivariate, missing-data, time-series |
| LDPE | Data from a low-density polyethylene production process. There are 14 process variables and 5 quality variables (last 5 columns). | 54 | 19 | multivariate, outliers, monitoring |
| Oil company DOE | Experimental data; testing the amounts of 4 materials added (A, B, C, D) in order to achieve a certain volumetric heat capacity, y. | 19 | 5 | multivariate, categorical, doe, regression |
| Peas | The taste of 27 pea varieties as measured by judges. After blanching, the peas were quick-frozen, packed into bags, and stored for 3 months. | 60 | 17 | multivariate |
| Raw material outcome | Six characterizing measurements for batches of plastic pellets; the outcome when using this material, either Poor or Adequate, is also provided. | 24 | 7 | multivariate, categorical, regression |
| Raw material properties | Measured characteristics of several batches (lots) of plastic pellets. | 36 | 6 | multivariate, missing-data |
| Room temperatures | Temperature measurements, in Kelvin, taken from 4 corners of a room. | 144 | 4 | multivariate |
| Sawdust | Sawdust from birch, pine, and spruce was blended in specific ratios. The corresponding NIR spectra are recorded. | 54 | 1204 | multivariate |
| Silicon wafer thickness | Thickness of a single wafer, measured at 9 locations for 184 consecutive batches. A single wafer is removed from a tray of wafers (always at the same position for each batch of wafers) after the chemical vapour deposition process is complete. | 184 | 9 | multivariate, outliers |
| Six-point board thickness | Thickness of 2x6 SPF boards from a saw mill. | 5000 | 6 | multivariate |
| Solvents | Physical properties of various chemical solvents, such as melting point, boiling point, dipole moment, refractive index, density, and solubility. | 103 | 9 | multivariate |
| Systematic method | Data are from an open-ended question on a final exam. The values are the grades achieved for the answer to that question, broken down by whether or not the student used a systematic method. No grades were given for using a systematic method; grades were awarded only for answering the question. | 44 | 2 | multivariate |
| Tablet NIR spectral data | Spectra, measured in the transmittance mode, of 460 pharmaceutical tablets; readings are from 600 to 1898 nm in 2 nm increments. | 460 | 650 | multivariate |
| Travel times | A driver uses an app to track GPS coordinates as he drives to work and back each day. The app collects the location and elevation data. Data for about 200 trips are summarized in this data set. | 205 | 13 | multivariate, missing-data |
| Wine DOE | Data from a fractional factorial for profiling a new wine. The last 5 columns are the taste values from a panel of judges. Higher values indicate a better overall taste. | 16 | 13 | multivariate, doe, regression |
| Wood fibres | A sample of aspen tree fibre as characterized by a fibre quality analyzer (FQA). | 25165 | 6 | multivariate |
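
As a quick sanity check on the Tablet NIR spectral data entry, readings from 600 to 1898 nm in 2 nm increments should give 650 wavelengths, which is consistent with the 650 columns listed for that data set. A minimal sketch of the arithmetic follows; the variable names are illustrative assumptions and are not taken from the data set itself.

```python
# Wavelength grid as described for the tablet NIR spectra:
# 600 nm to 1898 nm inclusive, in 2 nm increments.
# Variable names are hypothetical, chosen only for this illustration.
start_nm, stop_nm, step_nm = 600, 1898, 2

# range() excludes its end point, so extend it by one step to include 1898 nm.
wavelengths = list(range(start_nm, stop_nm + step_nm, step_nm))
n_wavelengths = len(wavelengths)

print(n_wavelengths)  # 650
```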
