Experimental Design

A number of factors affect the results of any experiment and should be considered as an integral part of its design and execution.

Clear experimental objectives and design
The desired outcome of the experiment and the hypotheses to be tested need to be clearly identified before the experiment begins. Once this has been done, the experiment can be designed to minimise variation in the results due to unwanted effects. For example, this could mean:
- Identifying fixed controller teams for the experiment
Each team would then control each of the trial scenarios, so the effect of differences between controller teams could be clearly identified and hence separated from the effects of the scenario variables.
- Trial schedules
Measured trials can be scheduled so that different controller teams run the scenarios in different orders. This separates any residual effect of controllers’ growing familiarity with the system from the effects of the scenario variables.
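
The scheduling idea above can be sketched in code. The following is a minimal illustration, not a prescribed INTEGRA method: it builds a cyclic Latin square so that every team runs every scenario once and each scenario appears once in each session slot. The function name, team names and scenario names are all hypothetical, and the sketch assumes there are as many teams as scenarios.

```python
def latin_square_schedule(teams, scenarios):
    """Return {team: [scenario for each session slot]} using a cyclic Latin square.

    Team i runs scenario (i + slot) mod n in session `slot`, so each scenario
    occurs exactly once per team and exactly once per session slot.
    """
    if len(teams) != len(scenarios):
        raise ValueError("this sketch assumes one team per scenario")
    n = len(scenarios)
    return {
        team: [scenarios[(i + slot) % n] for slot in range(n)]
        for i, team in enumerate(teams)
    }


# Hypothetical teams and scenarios, for illustration only.
schedule = latin_square_schedule(
    ["Team A", "Team B", "Team C"],
    ["Baseline", "Tool X", "Tool Y"],
)
for team, order in schedule.items():
    print(team, order)
```

A cyclic square balances the position of each scenario in the running order, but not which scenario immediately precedes which. If carry-over between consecutive scenarios is a concern, a more strictly balanced design may be preferable.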

Realistic traffic and airspace environment
When simulations are run it is not possible to model the entire airspace. Normally a small number of sectors are fully simulated and the results are collected from these sectors. The surrounding sectors are then normally modelled as feeder sectors which aim to provide a realistic interface to the measured sector.
For measurements to be valid, the traffic used in the simulation must be realistic for the airspace modelled. That is, it should display the characteristics of traffic flown in the sector, be grown (if appropriate) in accordance with forecasts, and fit with the flows in the wider airspace environment. Artificial traffic flows can make unworkable tools or procedures appear to work, or suggest unrealistically high benefits from the new tools or procedures.

Disciplined execution of the experiment
A trial needs to measure the Capacity, Safety and Efficiency arising from the Operational Concepts as they have been defined. (The results may of course suggest that improvements to the Operational Concepts are needed.)
To achieve this, two elements are necessary. Firstly, the Operational Concepts and corresponding operating procedures must be sufficiently mature to be well specified and workable.
Secondly, it is essential that participating controllers operate in the simulation according to the defined operational concept and operating procedures, even if they feel this leads to less-than-optimum control of the traffic. This must be addressed during training.
It would be possible to use the INTEGRA process in a less stable environment, i.e. where procedures are still being developed or the HMI modified, but the measurements achieved would only be comparative.

A mature experimental environment
It is important that the experimental infrastructure is also mature and reliable. For meaningful data to be collected, simulation runs must be completed without equipment or software failure. Such problems lead to loss of results and also loss of confidence in the experiment.
To ensure that the maximum amount of data can be collected from planned simulation runs, the simulation environment and recording facilities must perform correctly for the required number of runs and for the full duration of each run. The ability to restart quickly is also important if valuable simulation time and resources are not to be wasted.
The maturity of the system will also allow all participants to focus on the execution of the experiment.

Under CARE/INTEGRA, a contract was carried out to investigate ways of collecting and interpreting quantitative data on the workload of a human actor performing tasks in simulations from fields other than ATM. The aim was to determine whether any lessons from these areas can be exploited in the ATM simulation environment.
It should be emphasised that this was not a study of the role of humans in an advanced system, nor of the acceptability of the system. Rather, it addressed how to measure operator workload quantitatively when performing the tasks required by the simulation, and how to relate this workload to specific events and times in the simulation.
The objectives were:
- To review simulations of systems where humans are involved in an active way in the operation of the system or process; i.e. they are involved in assimilating data and interacting with the system in response to the data, not just monitoring the system
- To identify and critically review the mechanisms for measuring and collecting quantitative data on the workload of humans in these simulations, addressing: the association of the workload with a specific activity and a specific time in the simulation; the reconciliation of subjective and objective data; and the correspondence of the activity performed with the procedures designed for the system, i.e. whether the system is being operated in accordance with its design and whether the workload measured is therefore necessary for its operation
- To produce guidelines for the design of simulations, and the collection, interpretation and analysis of these data
- To apply the guidelines to design an experiment for a given ATM concept.

The reports that were produced from this activity are listed below.

Review of Non-ATM Human-in-the-Loop Simulations
This document identifies and reviews current human-in-the-loop simulations from fields other than air traffic management (ATM). In the systems selected, the human plays an active role in the simulated system or process, i.e. the system or process depends on the human performing an activity, or series of activities, for its correct action. Case studies are drawn from military and non-military human-in-the-loop (HITL) research.

Review of Workload Measurement, Analysis and Interpretation Methods
This report was prepared as part of a project being conducted under the EUROCONTROL INTEGRA programme. The aim of the project is to derive principles of workload measurement in man-in-the-loop simulations from experience in non-air traffic management (ATM) domains. This report describes the outcome of Work Package 2. Types of workload measure — performance based, subjective, and physiological/biochemical — are critically reviewed, and advice is given on methods of selecting the best set of measures for ATM simulations. In the next Work Package, the development of sound experimental designs incorporating these measures will be considered.

Experimental Design and Analysis Techniques for Human-in-the-Loop Simulation Evaluations
The fundamental psychometric criteria for measures are discussed. Also considered are the types of factor employed in experimental design, and important issues such as confounding. Specific topics addressed include the relative merits of within-subjects and between-subjects designs; calculation of statistical power to determine the correct number of participants; dealing with common data problems such as missing values and outliers (deviant scores); use of parametric versus non-parametric statistical tests; and minimising the effects of prior learning when comparing an existing system to a new concept. A flow chart is presented to assist in the selection of the most appropriate test for particular applications. Aspects of the implementation, analysis and interpretation of candidate workload measures are also described, and the possible role of modelling is discussed.
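
As an illustration of the power calculation mentioned above, the sketch below estimates the number of participants for a within-subjects comparison of two systems. It is not taken from the report: it uses the standard normal approximation, and the function name, the default significance level and the default target power are assumptions chosen for the example. The effect size is the expected mean difference divided by the standard deviation of the differences.

```python
import math
from statistics import NormalDist


def participants_needed(effect_size, alpha=0.05, power=0.80):
    """Approximate participant count for a two-sided paired comparison.

    Uses the normal approximation n = ((z_(1-alpha/2) + z_power) / d)^2,
    where d is the standardised effect size (an assumed input).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


print(participants_needed(0.5))  # medium effect -> 32
print(participants_needed(1.0))  # large effect  -> 8
```

An exact calculation based on the t distribution gives slightly larger numbers, so the result is best read as a lower bound when planning a trial.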

Measuring Workload in Man-in-the-Loop Simulations: Worked Example
Here we apply these principles to a worked example, in the form of the specification of a trial to compare the workload experienced using conventional and PHARE Demonstration 1 (PD/1) ATM systems. The issue to be investigated is whether the PD/1 system will allow controllers to cope with more traffic for the same level of workload. To demonstrate this increase in capacity, it is necessary to obtain evidence that the PD/1 system produces lower workload than the conventional system. The experimental design is defined in detail, and guidance for analysis, interpretation and reporting of the results is provided.

Last validation: 03/10/2005