[ Contents Vol 5, 1989 ] [ QJER Home ]

Reflections on "Comments on Tertiary
Entrance in Queensland: A Review"

Graham Maxwell
Assessment and Evaluation Research Unit
Department of Education
The University of Queensland
[Graham Maxwell was a consultant to the Working Party on Tertiary Entrance and continues to act as an honorary consultant to the Board of Senior Secondary School Studies. The views expressed in this paper are personal and should not be taken to constitute official opinions of the Board of Senior Secondary School Studies.]

"Comments on Tertiary Entrance in Queensland: A Review" (McGaw, 1989) represents an overwhelming endorsement of the overall thrust and most of the details of the report Tertiary Entrance in Queensland: A Review (Working Party on Tertiary Entrance, 1987). This endorsement by one of the most respected authorities on tertiary entrance in Australia should not be taken lightly. Tertiary entrance procedures involve matters of great complexity and subtlety. It is clear that the Working Party exercised considerable care and diligence in identifying the key issues, exploring alternative policies and procedures, and arriving at sensible recommendations. It is to be hoped that further deliberations about future arrangements concerning the interface between secondary schools and tertiary institutions will take as their starting point the analyses and recommendations of this report.

This paper offers some reflections on McGaw's comments. For the most part these reflections will reinforce or elaborate those comments. In a few instances I will take issue with McGaw's interpretation and proposed modification of the Working Party's recommendations, though it should be clear that we are substantially in agreement on the overall direction of change and differ on matters of detail. I would prefer that the recommendations be implemented with McGaw's suggested modifications than not be implemented at all.

Curriculum Diversity

McGaw points to the diversity of curriculum offerings in Queensland. As he suggests, there would seem to be an interrelationship between retention rates and curriculum diversity, as Williams (1987) has shown for the ACT. It would seem not to be accidental that the ACT and Queensland have both the highest retention rates and the broadest curriculum provisions. Queensland's school-based approach to the Senior curriculum and methods of assessment appears to have allowed a more flexible and adaptive response to increased retention and in turn to have encouraged more students to feel that it is worthwhile staying on, that is, that there is provision for their needs. That this breadth of provision works is attested by the flat distribution of subject choice in Queensland, that is, more students take up more of the options. In contrast, the distribution of choice in Victoria is much narrower (80 per cent of students choose the top 15 subjects) and has not changed in the past 30 years (Ministerial Review of Post-Compulsory Schooling, 1985, p. 8).

The diversity of Senior options and choice is actually greater than represented in Table One of the Working Party's report (Working Party on Tertiary Entrance, 1987, p.131). This is so in two ways: on the one hand, the eligibility rules for the TE Score require a minimum of only three subjects to be taken for a full four semesters with the possibility of taking each of the remaining eight units in a variety of subjects; on the other hand, students can go beyond the minimum requirement (of 20 units) by taking subjects other than Board Subjects, that is, Board Registered Subjects (previously, at the time of the report, called Board Registered School Subjects), School Subjects, or TAFE Subjects. The recommendation of the report for as few as three subjects to count for the tertiary entrance profile would allow even greater diversity of choice and was clearly intended to encourage students to take more non-Board subjects. This would have special advantages for the TAFE sector.

It is often suggested that the exclusion of non-Board subjects from the TE Score is unfair. There is nothing in principle to prevent the inclusion of Board Registered, School and TAFE subjects provided that certain criteria are met to ensure that the coherence of the system is maintained. The minimum requirements for inclusion of a subject would seem to be that the subject should involve an underlying component of intellectual skills including skills of analysis, reasoning, reading and writing such as are typical of other subjects included in the scaling, and that the subject should be submitted to the full accreditation and certification requirements of the Board (including monitoring and review procedures). There would, of course, be resource implications in any expansion of subjects beyond the current list of Board subjects and these additional resources would have to be assured.

Trade-off of Average Achievement and Number of Subjects

McGaw recognises the point and importance of calculating the recommended Composite Achievement Indicator in such a way that there is reasonable compensation between "how well" and "how many" (Working Party on Tertiary Entrance, 1987, Recommendation 39 and Supplementary Paper ix). In fact, Western Australia has recently found, as the Working Party anticipated, that the system described by McGaw, resulting from the recommendations of the Ministerial Working Party on School Certification and Tertiary Admissions Procedures (1984), has distorted the curriculum choices of students and produced certain anomalies (Andrich, 1989). The Queensland Working Party's suggested formula (1987, p.158) might look fearsome, but some such formula is clearly important if an option to count as few as three subjects is not to result in a contraction of the curriculum for everyone to just three subjects (or, at least, to only three subjects taken seriously). McGaw endorses the principle of a trade-off formula and appropriately states the principle that the bonus for taking extra subjects must be "not too great" but "not too small". Ultimately, the test of efficacy is not whether the formula is easily understood but whether it produces results that most informed observers consider fair and encourages most students to take more than the minimum number of subjects. Some examples which seemed reasonable to the Working Party are given in the report (Working Party on Tertiary Entrance, 1987, p. 157).
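
The trade-off principle can be illustrated with a sketch. The function below is entirely hypothetical in its name and weighting; it is not the Working Party's formula (1987, p.158), which is not reproduced in this paper. It simply averages a student's best three scaled results and adds a modest bonus for each additional subject counted, so that the bonus is "not too great" (weak extra subjects cannot swamp the average) but "not too small" (taking extra subjects always helps).

```python
def composite_indicator(scaled_results, minimum=3, bonus_weight=0.15):
    """Hypothetical trade-off between 'how well' and 'how many'.

    NOT the Working Party's formula: the average of the student's best
    `minimum` scaled results is supplemented by a small bonus derived
    from any further subjects counted.
    """
    best = sorted(scaled_results, reverse=True)
    base = sum(best[:minimum]) / minimum        # 'how well'
    extras = best[minimum:]                     # 'how many'
    bonus = bonus_weight * sum(extras) / minimum
    return base + bonus
```

On this sketch, a student with three results of 100 obtains 100.0, while a fourth subject at 100 lifts the composite to 105.0: enough to reward breadth, not enough to make the minimum three subjects the only ones "taken seriously".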

Tertiary Prerequisites

McGaw notes that the most constraining influence on choice of subjects continues to be the prerequisites of some tertiary courses. In an effort to keep their options open, many students take a pattern of subjects which satisfies prerequisites for the most prescriptive tertiary courses. The result is that many students take as many as four predominantly mathematical subjects (that is, Mathematics I, Mathematics II, Physics and Chemistry) when their interests, and the interests of the society, would probably be better served by a more balanced curriculum. The implications might be more precise definition of prerequisite knowledge and skills, revised tertiary curricula, bridging courses, alternative routes to the completion of degrees, even professional training at postgraduate level as in the USA. Increased retention rates and social pressures for secondary schools to provide a sound well-rounded education are placing enormous pressures on the interface with tertiary education. The need for tertiary courses to review their prerequisites is urgent and it is good to see that McGaw has urged such a review.


Multidimensionality

McGaw canvasses the multidimensionality issue without offering a clear resolution. In fact, of course, there is no resolution, only compromises which are more or less satisfactory. Definitions of what is satisfactory depend on the scope of the outcomes considered. The Working Party chose, although their report does not make this explicit, to take a broad view involving not just consideration of psychometric issues but also social and curriculum consequences, particularly in terms of backwash effects on the way in which students are likely to exercise their choices. Some of the reasoning in favour of retention of a single main index is explored in Supplementary Paper (i) of the Working Party's report. Nothing that McGaw says denies the validity of that reasoning.

It must be realised, however, that any selection decision necessarily involves the placing of multidimensional information onto a single dimension (even if there are only two points on it, as there are eventually: either 1 = select or 0 = reject). There can be no question about the necessity, only about what information is included and how it is combined. Furthermore, though this is a different issue, there are always people who just miss out (even by a whisker) and who might have replaced those who just squeaked in if the type of information, the circumstances under which it was obtained, and the way it was aggregated had been slightly different.

The Working Party obviously considered the possibility of reporting simply a subject performance profile, leaving the aggregation problem, and therefore the multidimensionality problem, to those responsible for the selection decisions. Where such a system has been tried, notably in New South Wales, the tertiary institutions have not in general sought to use the available information in more differentiated ways but have continued to calculate a general performance index (HSC Score). It is a clear case of how pressures of time, cost and feasibility play a big role in tertiary selection procedures. The system has become less, rather than more, sensitive to local circumstances, which could instead be dealt with through anomaly identification and appeals procedures operated by a separate certifying authority such as the Queensland Board of Senior Secondary School Studies.

Further to the provision of a profile of subject results, the need for comparability of results within a subject across schools would require at least a partial return to public examinations (subject reference tests for scaling school-based subject assessments). An omnibus scaling test such as that presently used can provide a suitable basis for scaling across subjects within each school (and then across schools) but would be completely inadequate for scaling each subject across schools. Currently, the adjusted Special Subject Assessments (SSAs) resulting from the first stage of scaling provide estimates of each student's general performance within the school, not estimates of each student's subject performance across schools. This point is discussed in greater detail in Appendix 1 of the Working Party's report.


Gender Differences

The question of whether there is an inbuilt bias in the calculation of TE Scores has been debated and researched for several years. Some recent research indicates that the matter of gender bias is extremely complex, and not all in the one direction, but that "when the differences in choice of subject combinations are accounted for, much of the variation in outcomes by gender is removed" (Allen, 1988). It is, at least, by no means clear that any patterns of difference in scaled results between boys and girls can be attributed to problems of multidimensionality and aggregation, as McGaw suggests was demonstrated for the ACT by Masters and Beswick (Masters and Beswick, 1986; Committee for the Review of Tertiary Entrance Score Calculations in the Australian Capital Territory, 1986). In a recent report Daley (1989) disputes this claim and identifies such differences as being related to the differences between the types of tasks found on the scaling test and school assessments. Adams (1984) showed, however, that the situation is even more complicated than that and, as Allen (1988) says, the difficulty is distinguishing "real" differences from "unfair" ones. The Working Party's proposals were not specifically directed towards resolving such issues though there are two recommendations of indirect relevance: Recommendation 37 on the addition of a Writing Task to ASAT, and Recommendation 38 on the introduction of anomaly detection procedures. Both of these recommendations have been implemented and their effects should be monitored, though the effects on gender differences are likely to be marginal. More research is obviously needed on this issue.


Subscales

Mention is made of the recommendations of the Committee for the Review of Tertiary Entrance Score Calculations in the Australian Capital Territory (1986) that the ACT calculate two aggregates based on different groupings of subjects scaled separately to the so-called quantitative and verbal subscales of ASAT (which lack any demonstrated construct validity for such a purpose). This has proved to be a confused proposal with unfortunate consequences: far from solving the perceived multidimensionality problem, it has made it worse. Implementation of that proposal, as should have been expected, has not been quite as recommended. An aggregate is still calculated but with different subjects scaled to different parts of ASAT. As Daley (1989) has shown, this is conceptually incoherent and worsens the very effects that it seeks to reduce, that is, it introduces some biases.

The Working Party's recommendation on subscales, as McGaw recognises, is an entirely different proposal aimed at providing additional information about student performance but without a sectioning of subjects. McGaw is quite right in claiming that there is no psychometric justification for the nesting of subscales within the Overall Achievement Positions (OAPs) and in recognising that this was a matter of "deliberate policy". The Working Party's concern was that one of the subscales (probably the "symbolising" dimension) might become the dominant scale, overriding even the OAP in importance and introducing unfortunate "backwash effects" on student choices of subjects. McGaw's analysis that such effects would occur anyway appears unassailable. Consequently, I would not wish to persist in arguing for the merit of nesting. Even so, it is important to draw attention to the possibility of untoward effects in the use of profiles in selection procedures and the need for widespread discussion before any such procedures are put in place.

The more fundamental issue, which informed the comments of Maxwell and Allen (1987) about "global" versus "regional" approaches to which McGaw refers, is the matter of banding and levels of precision in reporting the scaled results. This needs further discussion (see later).


Scaling

The necessity for some form of scaling needs to be continually stressed. McGaw rightly does not mince his words: "The abolition of scaling would, however, re-introduce the inequities that characterised the earlier system when there were clear benefits in taking subjects in the company of less able students."

Explanations of the present scaling procedures for TE Scores have been provided elsewhere (Maxwell, 1987; Maxwell, 1988). The general principle is to make sure that each student's TE Score (or OAP) is as independent as possible of which subjects that student chooses to take and of who else chooses to take those subjects. Paradoxically, of course, this requires that the general ability of students in each subject and school be taken into consideration. The scaling procedures can be seen as directed at removing those components of each student's scores that are arbitrarily related to the company they have kept (that is, the performance of other students in those subjects and that school). The Board of Senior Secondary School Studies may unwittingly contribute to popular misconceptions on this matter by publishing the means and standard deviations on ASAT for statewide subject groups. Such statistics are strictly meaningless in terms of the scaling procedures. What is taken into consideration is the distribution of ability of the group of students taking each subject within each school, which can vary from school to school and year to year (even though the statewide data are remarkably stable). The first stage of scaling can also be thought of (but not so accurately) as estimating what the subject results would have been like if everyone in the school had taken that subject.
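
A simplified sketch may make the first-stage principle concrete. The code below assumes only that the within-school subject assessments are linearly rescaled to the mean and standard deviation of the same group of students on the scaling test; the function name is hypothetical and the Board's actual procedures (Maxwell, 1987) are more elaborate. What the sketch does illustrate is that it is the ability of the group taking the subject within the school, not any statewide statistic, that drives the adjustment.

```python
from statistics import mean, pstdev

def scale_to_group(subject_scores, group_scaling_test_scores):
    """Simplified first-stage scaling: linearly transform a school's
    subject assessments so that their mean and standard deviation match
    those of the SAME group of students on the scaling test.

    Illustrative only; the Board's actual procedures are more elaborate.
    """
    m_s, s_s = mean(subject_scores), pstdev(subject_scores)
    m_t, s_t = mean(group_scaling_test_scores), pstdev(group_scaling_test_scores)
    if s_s == 0:  # all subject assessments identical: no spread to rescale
        return [float(m_t) for _ in subject_scores]
    return [m_t + (x - m_s) * (s_t / s_s) for x in subject_scores]
```

Note that only the rank order and relative spacing of the subject assessments survive the transformation; the location and spread come from the group's own performance on the scaling test, which is why the statewide subject-group statistics published on ASAT play no part in the calculation.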

Recent ongoing analyses show that the current scaling procedures actually work remarkably well. One analysis has shown generally strong relationships within schools between student average levels of achievement (AVLA) and rescaled aggregates (RAGs). Departures from strong relationships are now identified through a set of anomaly detection procedures and referred for special consideration to an appeals committee. The anomaly detection procedures also involve other tests of the possible lack of fit of the scaling model in particular situations but the total number of such cases is currently quite small.

Two points about anomaly detection procedures should be emphasised. One point is that this is simply part of the ongoing exploration of ways in which the intentions of the scaling system can be better realised. Over the years various procedural and calculational changes have been introduced, both minor and substantial, as new understandings have been reached about how the process can be improved. The scaling system is known to work better now than it did in the past. The second point is that further improvements, both in anomaly detection and in other technical matters, can almost certainly be effected and could be achieved more rapidly if additional resources could be found for the necessary research.


Banding

The question of banding is the one on which different people appear to take fundamentally different positions, though it may be possible to reconcile them. McGaw's position would seem to be that there is an underlying, infinitely divisible, continuous dimension of general performance and that the rescaled aggregates, taken to the nearest unit, represent a meaningful level of precision with which to locate each student on that dimension. An alternative proposition is that the scaling is discrete rather than continuous and that we might look to the characteristics of the data themselves to decide what level of discreteness in the output data is most consistent with the level of discreteness of the input data and the effects of the scaling. This is, no doubt, a shocking proposition for adherents of classical measurement theories. However, it is possible to show that the assumptions of continuity and discreteness have different consequences for interpretation of the rescaled aggregates (or the recommended composite achievement indicator).

Lately I have been involved with several other people in analysing the characteristics of the scaling system through a process which we have termed 'perturbation analysis'. The essence of the process is to perturb the data a bit and see what happens. The central proposition is that a reasonable level of output precision would be one where on the one hand a small change in the input data produces little or no change in the output data and on the other hand a large change in the input data produces a noticeable change in the output data. It is necessary, of course, to operationalise the meaning of 'small' and 'large' in these contexts. A complete explanation must await a full report on these studies. A general overview must suffice here.

A variety of small discrete changes in the input data have been investigated. These have included arbitrary changes to some of the data at the level of the minimum change possible (one point on the input scales), the introduction of a twin with identical results, and the removal of the top or bottom 5 per cent of a school. What is examined is the effect on the other students. As the number of bands increases from 20 (proposed OAPs) to 100 (present TE Scores) to 1000 (RAGs), the effects become more substantial; that is, the number and size of changes of classification become larger. The consequences for many students' TE Scores of such marginal changes in the input data relating to other students are alarmingly large, and the RAGs are even more wildly unstable. On the other hand, such perturbations produce, for 20 bands, what would generally be considered a reasonable number of changes by one band and essentially no changes by more than one band. There is considerable noise in the scaling system but the 'signal to noise' ratio would seem to be recognisably appropriate at about 20 equal-size bands. (The second requirement, of noticeable change in output for a large change in input, such as a change for a student by two levels of achievement in half their subjects, is also satisfied by 20 bands.)
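
The logic of perturbation analysis can be sketched in miniature. The code below is a deliberate simplification: it bands students by rank order alone, whereas the actual studies passed perturbed data through the full scaling system. Even so, it shows the central phenomenon, that the same minimal perturbation which leaves a coarse (20-band) classification untouched does change classifications when the banding is fine-grained.

```python
def band(rank_fraction, n_bands):
    """Assign a rank fraction in [0, 1) to one of n_bands equal-size bands."""
    return min(int(rank_fraction * n_bands), n_bands - 1)

def classification_changes(scores, perturbed, n_bands):
    """Count students whose band changes under a perturbation.

    A miniature of 'perturbation analysis': students here are banded by
    rank order alone, not by the full scaling system.
    """
    def bands_for(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        ranks = [0.0] * len(xs)
        for pos, i in enumerate(order):
            ranks[i] = pos / len(xs)
        return [band(r, n_bands) for r in ranks]

    return sum(b != a for b, a in zip(bands_for(scores), bands_for(perturbed)))
```

Swapping the results of two adjacent students among 100, for instance, changes no classifications at 20 bands but does change classifications at 100 bands: the finer the banding, the more readily the noise in the inputs passes through to the reported results.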

Three further points need to be made. First, the use of a more fine-grained banding, as with TE Scores, may itself be a source of much of the public dissatisfaction with the present system; parents and students are well aware, even if measurement experts are not, that the present system produces unstable results, and much of this dissatisfaction could be expected to dissipate if the results were reported to a reasonable degree of precision, one that could be seen to actualise the aim that each student's final result be unaffected by the vagaries of the performance of other students in the same subject and school. Second, contrary to the claims of both McGaw (1989) and Sadler (1987), neither of whom has analysed real data, the perturbation analyses show that the instability at the top end is not less but more than that in the middle, and that equal-size bands are more reasonable than unequal-size bands at all levels of overall achievement. Third, there seems no justification for going beyond the tolerance of the data for any purpose; we would not do so with scientific data. Clearly, TE Scores, and a fortiori RAGs, are beyond the level of meaningful tolerance in the data and bring the system into disrepute when used for selection decisions.

Banding on Subscales

The point needs to be made that more research would need to be done on the recommended supplementary scales before they could be implemented. However, the general idea still seems feasible despite some initial negative reactions. It may not be necessary to adhere to the bands-within-bands proposal, though the cautions of the report about the possible repercussions of competing scales should be noted. Further, it should be noted that the correlations between these scales and the main scale can be expected to be much less than has been suggested. In a trial run for the Working Party the correlations were about .5 to .6; it must be remembered that each SAP depends on the choice of subjects (whether high or low weightings on that dimension) as well as on the quality of performance.

Reporting of Information

McGaw discusses what he describes as a 'potential conflict' between the criteria-and-standards based information of the levels of achievement and the explicitly normative information of TE Scores or OAPs. Rather than this being seen as a conflict, it would seem better to stress the different but complementary information provided. Levels of achievement indicate relative performance within each subject, and to some extent within each school (defined by expectations deemed to be appropriate within that subject, and to some extent within that school) whereas TE Scores or OAPs indicate relative performance overall (independent of subjects taken and school attended).

Furthermore, it must be remembered that standards for the levels of achievement need to be defined so as to represent an anticipated range of student performance. As such they cannot escape their underlying normative base; no useful set of standards can. SSAs require finer discrimination within levels of achievement. Such finer discriminations are in any case necessary as part of the system of monitoring and review.

The real conflict would appear to occur when the adjusted SSAs are interpreted erroneously as indicating comparative performance within the subject across all schools (rather than comparison across all subjects within a school). For this reason, it would appear preferable to require all SSAs to be reported to students and parents on a standard scale (say, with mean and standard deviation of 62 and 12 respectively). Some schools have adopted the practice of prestandardising each set of SSAs to the group mean and standard deviation on ASAT, a practice which invites precisely this interpretive confusion.
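
Reporting on such a standard scale is a simple linear transformation. A minimal sketch follows, assuming the suggested mean of 62 and standard deviation of 12; the function name and the absence of any rounding are hypothetical, and actual reporting conventions would be a matter for the Board.

```python
from statistics import mean, pstdev

def standardise_ssas(ssas, target_mean=62.0, target_sd=12.0):
    """Re-express a school's set of SSAs on a common standard scale
    (here mean 62, standard deviation 12) so that they cannot be
    misread as cross-school subject comparisons.  Illustrative only.
    """
    m, s = mean(ssas), pstdev(ssas)
    if s == 0:  # all assessments identical: place everyone at the target mean
        return [target_mean for _ in ssas]
    return [target_mean + (x - m) * (target_sd / s) for x in ssas]
```

Because every set of SSAs would then carry the same mean and spread, the reported numbers could only be read as within-school, across-subject comparisons, which is all they are.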


Conclusion

So far, several of the Working Party's recommendations could be said to have been implemented: Recommendations 11, 12 and 45 on QTAC and the timing of the first round of tertiary offers; Recommendations 36 and 37 on the addition of a Writing Task to ASAT (the two components together constituting what is now called the Common Scaling Test); and that part of Recommendation 50 relating to the detection and correction of scaling anomalies. I agree with McGaw that more of the recommendations ought to be implemented and that it is not necessary to argue for all or nothing. Finally, the Working Party's report warns us that the issues concerning tertiary entrance are complex, that the present system has actually worked fairly well, that simple affordable alternatives are difficult to invent, and that alternatives adopted without careful consideration of their consequences may damage the quality of education for many years. Whatever decisions are taken for the future, the importance of ongoing research is paramount.


References

Adams, R.J. (1984). Sex Bias in ASAT? (ACER Monograph No. 24). Hawthorn, Vic.: Australian Council for Educational Research.

Allen, J.R. (1988). ASAT and TE Scores: A Focus on Gender Differences. Brisbane: Board of Secondary School Studies.

Andrich, D. (1989). Upper-Secondary Certification and Tertiary Entrance: Review of Upper-Secondary Certification and Tertiary Entrance Procedures commissioned by the Minister for Education in Western Australia. Perth: (mimeo).

Committee for the Review of Tertiary Entrance Score Calculations in the Australian Capital Territory. (1986). Making admission to higher education fairer. (Chair: Dr Barry McGaw). Canberra: Australian Capital Territory Authority, Australian National University, Canberra College of Advanced Education.

Daley, D.J. (1989). Determining Relative Academic Achievement for Fair Admission to Higher Education. (Report to a Joint Committee of the Australian National University, the Canberra College of Advanced Education, and the ACT Schools Authority appointed to supervise research into Tertiary Entrance Score calculations). Canberra: (mimeo).

Masters, G.N. & Beswick, D.G. (1986). The construction of tertiary entrance scores: Principles and issues. Melbourne: Centre for the Study of Higher Education, University of Melbourne.

Maxwell, G.S. (1987). Scaling school-based assessments for calculating overall achievement positions. Appendix 1 in Working Party on Tertiary Entrance. Tertiary entrance in Queensland: A review. (Chair: Mr John Pitman). Brisbane: Joint Advisory Committee on Post-Secondary Education and Board of Secondary School Studies (pp. 190-200). [Also in The Tertiary Entrance Score - A Technical Handbook of Procedures. Brisbane: Board of (Senior) Secondary School Studies, 1988 (pp. 44-52).]

Maxwell, G.S. (1988). The how and why of TE Scores. The Graduate Connection, Queensland edition, September 3-16 and 25-26.

Maxwell, G.S. & Allen, J.R. (1987). A rejoinder to the paper by D. R. Sadler: "An analysis of certain proposals contained in 'Tertiary Entrance in Queensland: A Review..." Brisbane: Board of Secondary School Studies.

McGaw, B. (1989). Comments on Tertiary Entrance in Queensland: A Review. Queensland Researcher, 5(2), 25-44. http://www.iier.org.au/qjer/qr5/mcgaw.html

Ministerial Review of Post-Compulsory Schooling. (1985). Report Volume 1. (Chair: Ms Jean Blackburn). Melbourne: Ministerial Review of Post-Compulsory Schooling.

Ministerial Working Party on School Certification and Tertiary Admissions Procedures. (1984). Assessment in the upper secondary school in Western Australia. (Chair: Dr Barry McGaw). Perth: Western Australian Government Printer.

Sadler, D.R. (1987). An Analysis of certain proposals contained in Tertiary Entrance in Queensland: A Review with particular reference to the achievement position profile and step-wise selection. St Lucia: Assessment and Evaluation Research Unit, Department of Education, The University of Queensland.

Williams, T. (1987). Participation in education (ACER Research Monograph No. 30). Hawthorn, Vic.: Australian Council for Educational Research.

Working Party on Tertiary Entrance. (1987). Tertiary entrance in Queensland: A review. (Chair: Mr John Pitman). Brisbane: Joint Advisory Committee on Post-Secondary Education and Board of Secondary School Studies.

Please cite as: Maxwell, G. (1989). Reflections on "Comments on Tertiary Entrance in Queensland: A Review". Queensland Researcher, 5(1), 45-60. http://www.iier.org.au/qjer/qr5/maxwell.html
