Literacy is a highly complex skill. In today's world it is deemed that all citizens require literacy skills from an early age. Surveys to measure literacy standards, especially of school children, have been conducted in recent times. One of these, released in September 1997 by the Commonwealth of Australia, reported that approximately 30 per cent of students in grades 3 and 5 failed to meet 'proper' standards of literacy. A detailed analysis was conducted of student texts released into the public domain from the survey data. This analysis shows that a single feature consistently distinguishes texts regarded as 'well above standard' from those regarded as 'well below standard' - consistent use of conventional spelling. Qualities that were apparently less valued include following directions, keeping to the topic, maintaining a flow of ideas, taking an authoritative stance, and controlling quite complex grammatical structures. Attention should be given to success in the more demanding aspects of literacy as well as in control of surface features.
Literacy is an ability that human beings developed about 3000 BC to assist record keeping for trading activities in Sumeria (Gelb, 1963). In the interim it has grown into a complex, multifaceted ability encompassing an expanding range of media and demanding an increasing array of communication skills and understanding (Resnick & Resnick, 1997). Today, western society and developing economies have become dependent on all citizens having advanced literacy skills (Lo Bianco & Freebody, 1997).
One of the consequences of this expectation is the felt need of politicians to measure the literacy levels of all - students and adults alike. In the last twenty years in Australia there have been several major surveys to measure literacy performance in adults and children, three of them in September 1997 (Australian Bureau of Statistics, 1997; Bourke & Keeves, 1977; Bourke, Mills, Stanyon & Holzer, 1981; Commonwealth of Australia, 1997a, 1997b; Wickert, 1989).
A complicating factor is the competition over what kinds of measures are suitable, valid and reliable. The surveys listed above used different materials and measures, thereby making valid comparisons difficult. Queensland, like many other school systems, has monitored students' literacy for over 60 years, using replicated and therefore comparable tests, to show that literacy levels were maintained or increased over that time (Duck, 1979; Jacobson, 1978; Peckman, Fifoot, & Byrne, 1988; Review and Evaluation Directorate, 1993). Since the National and Agreed Goals of Schooling were signed by State Education Ministers in Hobart in 1989, several States and the Federal government have introduced standardised testing of students' basic literacy skills. The results of these tests are generally reported widely in local and national newspapers.
Today, co-operative efforts are being made to establish literacy benchmarks to be used in schools across all Australian states and territories. As part of this national project, data were collected from 7,454 students in 400 schools across Australia in September 1996.
To help the general public appreciate the nature of the 'literacy problem', Dr Kemp provided writing samples of pupils in Years 3 and 5 for publication in The Australian on Tuesday 16 September 1997. They appeared in the article 'A teaching method fails its children'. These samples, additional to those provided in the published report, were used to give the public an opportunity to determine what is meant by children writing 'properly'.
The sample texts were taken from three tasks that were common to the respective Year levels (Commonwealth of Australia, 1997b). These student samples were used to illustrate standards that were deemed to be 'well above' and 'well below' Year 3 and Year 5 standards of writing. These tasks directed students to:
Well above standard (Text A) | Well below standard (Text B) |
I think birds should not be cept in a cage because they need to fly and they need to get some air. And it needs to look around the earth. | Birds shuold be cerd in cajr bcerce they are nisd pers and they hap you go to pars and I like birds very marg. |
Well above standard (Text C) | Well below standard (Text D) |
One day I was walking up a mountain in Port Stephens. I was walking slowly, breathing in the sweet aroma of a purple vilot. It smelt good. I was nearlly at the top of the mountain when suddenly I heard a strange noise. It sounded like a Roc screaching high up in the mountains. | I went on a adventre with a Roc. it flue me & evry one to school & home. it took me on meney advents. I tock him in for nesw. & he skead the tchors out of ther with. Thay were so skead that ther skeltion com out. |
Well above standard (Text E) | Well below standard (Text F) |
Parent's often complain about having to fork out pocket money. But I think the parents should stop and think. If they did't give out pocket money they would be the ones paying for CD's, books, toys, etc. | I think that if you won't to be paid you shuold work for it for igsampal you should mo the lown wosh the car wask up after tea and all uther things. |
First, a traditional approach once used unquestioningly was a simple word count. Using this quantitative technique, Texts A, C and E (30, 53, 37 words) outperform Texts B, D and F (24, 48, 32 words), confirming the survey's results. Another abandoned approach from the United States is the T Unit count, that is, the number of principal clauses in a text (Hunt, 1965). This technique shows Texts A and B to be of equal levels with 3 principal or independent clauses each, Text C with 5 principal clauses to be inferior to Text D with 6 principal clauses, and Text E with 3 principal clauses to be superior to Text F with 1 principal clause. As this technique was abandoned because it ignored the complexity of texts with subordinate and embedded clauses, it would be appropriate to compare the total number of clauses in each pair of texts. On this basis, Text A with 5 clauses is seen to be superior to Text B with 4 clauses; Text C with 6 clauses is inferior to Text D with 7 clauses; and Text E with 5 clauses is inferior to Text F with 6 clauses.
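The comparison above can be made concrete with a short sketch. The Python below is an illustration only, not part of the survey's marking method: it takes the counts reported in this paragraph as given and asks which text of each pair scores higher under each measure. The names `REPORTED`, `PAIRS` and `compare` are hypothetical.

```python
# Counts as reported in the paragraph above:
# (words, principal clauses, total clauses) for each sample text.
REPORTED = {
    "A": (30, 3, 5), "B": (24, 3, 4),
    "C": (53, 5, 6), "D": (48, 6, 7),
    "E": (37, 3, 5), "F": (32, 1, 6),
}
MEASURES = ("words", "principal clauses", "total clauses")

# Each pair is (text rated 'well above', text rated 'well below').
PAIRS = (("A", "B"), ("C", "D"), ("E", "F"))


def compare(above: str, below: str) -> dict[str, str]:
    """Name the text of a pair that scores higher on each measure."""
    verdicts = {}
    for index, measure in enumerate(MEASURES):
        a, b = REPORTED[above][index], REPORTED[below][index]
        verdicts[measure] = "equal" if a == b else (above if a > b else below)
    return verdicts


for above, below in PAIRS:
    print(f"{above} vs {below}: {compare(above, below)}")
```

Run as written, the sketch reproduces the pattern described: word count favours Texts A, C and E in every pair, whereas total clause count favours Text D and Text F.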
Already it is clear that the choice of methods used to measure literacy performance can influence the results. Therefore statements about literacy standards need to be supported with precise explanations about assessment methods.
In the United Kingdom it was traditional to use global or impression marking (Hartog & Rhodes, 1936; Wiseman, 1961). Such marking, using implicit criteria, has been shown to be influenced heavily by surface features such as handwriting, spelling and punctuation (Charney, 1984; Roach, 1971; Wood, 1991). It would appear from a glance at the texts, therefore, that this method could result in the same outcome as the total word count, making Texts A, C and E superior to Texts B, D and F.
These approaches demonstrate that the most superficial of measures, word count and surface features, may have led to the findings reported in Literacy Standards in Australia. Measures that consider communicative understanding and complexity of thought, both represented through complex language structures, could support this outcome or produce different findings. Further examination is warranted.
An approach currently favoured in the United States, in the United Kingdom, and in Australia, is to assess components of a text individually and then assign an holistic or aggregated assessment by trading off strengths and weaknesses (Diedrich, 1974). Judgments on the components are usually made against predetermined criteria, the method being named for this process - criterion-referenced or standards-referenced assessment (Sadler, 1987). Studies show that this method forces markers to look more closely at aspects of texts (Gipps, 1994). Aspects overlooked in superficial impression marking are taken into account, thus altering the marker's assessment of the quality of a text.
The report (Commonwealth of Australia, 1997a, p. 18) states that the texts in the Australian survey were assessed on three features: the quality of thought (including students' abilities to express ideas, to write imaginatively, to develop an argument clearly and logically, and to support a point of view); language control (including spelling, punctuation, and vocabulary); and sense of purpose and audience. Following assessment, each text was determined to have scored above or below a set 'cut score'. It is unclear what marking procedure was employed. It may have been impression marking with the identified features used as a guide, or profiling against all features, or criterion-referenced assessment with markers trading off strengths against weaknesses.
As the relative values of these pairs of sample texts have been pre-determined, it should be possible to analyse the texts into components that are comparable and thus identify the features, or aspects of them, that have been used to distinguish 'above standard' from 'below standard' texts, that is, to determine criteria for the 'cut score'.
Connective | Thing | Event | Thing | Circumstance |
| I | think | birds should not be cept in a cage | |
because | they | need to fly | | |
and | they | need to get | some air. | |
And | it | needs to look | | around the earth. |
Connective | Thing | Event | Thing | Circumstance |
| Birds | shoudd be cerd | | in cajr |
bcerce | they | are | nisd pers | |
and | they | hap you go | to pars | |
and | I | like | birds | very marg. |
Connective | Thing | Event | Thing | Circumstance |
| I | was walking | | One day up a mountain in Port Stephens. |
| I | was walking | | slowly |
| | | | breathing in the sweet aroma of purple vilot. |
| It | smelt | good | |
| I | was | | nearly at the top of the mountain |
when | I | heard | a strange noise. | suddenly |
| It | sounded | | like a Roc screaching high up in the mountains. |
Connective | Thing | Event | Thing | Circumstance |
| I | went | | on a adventre with a Roc. |
| it | flue | me and evreyone | to school & home. |
| it | tock | me | on meney adventrs. |
| I | took | him | in for nesw. |
& | he | scead | the tchors | out of ther with. |
| Thay | were | so skead that ther skeltion com out. | |
Connective | Thing | Event | Thing | Circumstance |
| Parent's | often complain about | having to fork out pocket money. | |
But | I | think | the parents should stop and think. | |
If | they | did not give out | pocket money | |
| they | would be | the ones paying for CD's, books, toys, etc. | |
Connective | Thing | Event | Thing | Circumstance |
| I | think | that you shuold work for it for igsampal you should mo the lown wosh the car wash up after tea and all uther things. | |
if | you | won't | to be paid | |
The two writers show sensitivity to the context although they express their understanding differently. The writer of Text A takes a negative stance, opening the text with a statement of opinion expressed in the subjective first person, I think ... . In contrast, the writer of Text B takes an affirmative position, opening the text with an authoritative assertion expressed in the objective third person, Birds should ... . Their differing perspectives are captured by I think in Text A and I like in Text B. The writer of Text A assumes a conservationist role focusing on the needs of birds, while the writer of Text B speaks as a pet lover focusing primarily on his/her own feelings. These personal and social perspectives are presumably intended to catch the sympathy of, or to persuade, readers of the magazine.
While the two texts are not grammatically identical, they parallel each other in clause structures. Each has one independent/principal clause, and three dependent/subordinate clauses. The dependent clauses are linked logically to the independent clause with the same conjunctions in the same sequence - because, and and and, indicating similar patterns of thinking. There is a minor problem with cohesion in Text A in the cohesive chain, birds, they, they, it; and a related difficulty in the generalised birds being kept in a cage. On the other hand, Text A uses an embedded noun clause, (that) birds should not be cept in a cage, as object of think. Both texts indicate attempts to make fine distinctions in meaning in that each text includes five instances of words expressing personal opinion to differentiate them from neutral description; that is, think, should, need to, need to, need to in Text A, and should, nice, (have to), like, very in Text B. It could be argued that some air in Text A refers more accurately to some space. Similarly, the repetition of need to could be debated: is the repetition deliberate for reasons of emphasis, or is it the result of an inability to find synonyms?
With respect to conventions of spelling and punctuation, there are considerable differences between the two texts. Text A of 30 words has 3.3 per cent spelling errors, while Text B of 24 words has 36 per cent spelling errors. The patterns of errors in Text B suggest that the writer could have a hearing impairment that impacts on spelling ability if a phonic orientation is used almost exclusively. The writer of Text A has apparently included the second sentence as an afterthought and thereby given it the erroneous status of a separate sentence, punctuating it as such. Conversely, the final clause of Text B, a personal appraisal, could have been represented in a separate sentence.
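The spelling error percentage quoted for Text A can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: words are taken to be whitespace-separated tokens, and the set `MISSPELT_A` is supplied by hand on the assumption that cept is the only unconventional spelling in Text A. The source does not list which words were counted as errors, so the names `MISSPELT_A` and `error_rate` are hypothetical and no attempt is made to reproduce the figures for the other texts.

```python
import string

# Text A exactly as reproduced in the sample table.
TEXT_A = (
    "I think birds should not be cept in a cage because they need to fly "
    "and they need to get some air. And it needs to look around the earth."
)

# Assumption: 'cept' is the only unconventional spelling in Text A.
MISSPELT_A = {"cept"}


def error_rate(text: str, misspelt: set[str]) -> float:
    """Return misspelt words as a percentage of all whitespace-separated words."""
    words = text.split()
    errors = sum(
        1 for word in words
        if word.strip(string.punctuation).lower() in misspelt
    )
    return 100 * errors / len(words)


print(len(TEXT_A.split()))                       # 30
print(round(error_rate(TEXT_A, MISSPELT_A), 1))  # 3.3
```

One misspelling in 30 words gives the 3.3 per cent reported above. Any such figure depends on how words are tokenised and on which spellings the marker treats as errors.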
In summary, there are no features that distinguish the merits of the two texts in terms of the quality of their ideas. There are differences in language control. Text A shows minor lapses in cohesion, and in discriminating choice of vocabulary. It does, however, show some comparative complexity in incorporating an embedded noun clause. In the use of spelling conventions Text A is far superior. In trading off strengths and weaknesses, the markers appear to have placed high value on conventional spelling and the use of a noun clause to the extent that Text A is awarded 'well above standard' even though it exhibits lapses in cohesion and vocabulary choice. Perhaps the lapses were cancelled out by the use of a noun clause. Text B shows no grammatical lapses. Apart from that, it differs from Text A in two respects: it has no embedded noun clause and its spelling error rate is about 33 percentage points higher. One must presume that Text B is well below standard because it has no noun clause and/or it has a large proportion of spelling errors.
Text C establishes a setting, introduces something of a character, I, and mentions the legendary creature, Roc, in a phrase in the last sentence. As there are no interactions between the characters, there is no adventure and so the task directions are not followed properly. There is a narrative sequence of Events, but these events concern only one character, I. Text D is quite different. Characters include I, Roc, everyone (at school and home) and teachers. I initiated actions twice and Roc three times; the teachers reacted emotionally in being scared. Text D follows the task directions fully.
Development of ideas in a narrative adventure generally centres firstly around a range of interactive activities or Events, and then around Circumstances of places in which Events occur. In Text C the Events associated with the one character are limited to was walking, was walking, was, heard. Three of the seven Circumstances develop the mountain setting, the remaining four describing the 'how' of Events. In Text D the Events list the actions of three groups of characters: went, flue, tock, took, scead, were. The five Circumstances identify the various locations of five of these adventurous activities. Only Text D observes the standard conventions of narrative adventures.
The social purpose of narrative adventures is to relate an exciting tale where a problem arises and the character/s find a way to resolve it. Text C has no such problem unless hearing a strange noise can be counted as one. Text D describes types of activities involving the author and his/her friends, his/her teachers and a Roc. The problem arises when the Roc is taken to school for 'news' and the teachers are scared. The satisfying conclusion for the writer is that the teachers were so sceard that ther skeltion com out.
The contextual sensitivities of Texts C and D are also quite different. Text C is introspective, orientated as if from a dreamy wanderer who seems surprised at being interrupted by a strange noise. The anticipated audience seems to be a similarly reflective reader. Text D evokes a sense of shared fun. It is overtly written from the perspective of a young person of school age and intended for a peer audience who would equally enjoy the spectacle of teachers being scared.
The language control of the two texts is variable. Both texts present coherent ideas in six major clauses, all independent ones except for one dependent clause in Text C. Text D has a minor clause embedded in a postmodifying position, that ther skeltion com out. Text C includes two postmodifying phrases, breathing in the sweet aroma of vilot and screaching high up in the mountains. Both texts use repetition. In Text C was walking is used once to identify where, and the second time to explain how. In Text D took is repeated to highlight a reverse parallelism to show that taking the Roc to class was seen as a 'problem' for that context. Sceard is also repeated to reinforce the planned achievement in scaring the teachers.
Technical conventions are observed in Text C except for two misspellings in vilot, nearlly and screaching, 9 per cent in 53 words. Spelling conventions in Text D are not met in 33 per cent of 48 words, & is used instead of and, and three capital letters are missed at the beginning of marked sentences.
Overall, then, this is an interesting situation where Text C does not meet task specifications but is smooth and technically correct except for two words spelled incorrectly. Text D meets all task and communication specifications, controls complex postmodifying structures, but fails to observe spelling conventions. One can only presume that correctness in surface features is valued over meeting task and communicative expectations.
Ignoring the content, the overall organisation of Text E is designed to present a conditional point of view. Text F addresses the topic as given, presenting a conditional point of view that is supported by illustrative examples of work that children could be expected to do.
For this task, no context was provided beyond the directive 'your view', so the writers were forced to imagine a purpose and audience or to assume that the test setters/markers were the only intended audience. As neither text includes Circumstances, it would appear that both writers took the second option. The writer of Text E takes the role of champion of children's rights. In contrast, Text F adopts the role of ethical mediator on the concept of children's responsibilities as members of a community.
Language control in the two texts varies slightly in quality. Each text comprises complete clauses that are linked with appropriate conjunctions. Text E uses a simple set of structures whereas Text F successfully incorporates layers of embedded noun clauses and an interpolated conditional clause within the first noun clause. In addition, the writer of Text F has effectively elided conjunctions, subjects, and a verb to enhance the flow of the text. The writer of Text E has chosen to use colloquial expressions, fork out and give out, when describing the distribution of pocket money by parents. This could be a response to the relatively informal structure of the task.
It is in the area of social consideration through language conventions that greater differences are evident. Spelling errors occur in 5 per cent of the 37 words in Text E, and in 27 per cent of the 32 words in Text F.
In brief, Text E does not address the topic, uses simple clause structures but spells most words correctly. Text F sticks to the topic, uses very complex clause structures but makes many spelling errors.
The relative weightings given to these four general aspects of literacy are self-evident. The first three of these, that is, keeping to the topic, being sensitive to the context, and using standard grammatical patterns, are generally considered more demanding than the fourth. In each pair of texts those deemed to be 'below standard' in fact outperform those considered to be 'above standard'. Text A does use complex grammatical structures but Text B outperforms it in other areas of grammar. It is only in the fourth aspect, surface conventions, that the 'above standard' texts are superior - and only in the use of standard spelling.
By analogy, one can reject a house as unacceptable if it is clad in unfashionable materials or colours, but such a personal opinion does not alter the functional viability of the building. It is when the cladding is functionally unsuitable that there is some cause for concern. Greater concern is reserved for buildings that are not built to specifications.
Texts B, D and F are functional because, fundamentally, their writers are communicatively competent, building on some spelling knowledge, including grapho-phonic knowledge - albeit insufficient. What is of greater concern is that the 'above standard' texts are structurally inferior, and in two of the three cases fail to follow directions.
The 'substandard' texts illustrate the heart of literacy in action. Those deemed to be 'above standard' reveal the artificiality of literacy tests such as the one that caused these texts to be produced. They also challenge the validity of the results, given the well-documented circumstance of students writing longer and higher quality texts when they choose or manipulate the topic (Collins, 1993; Fine, 1985; Gradwohl & Schumacher, 1989).
These analyses bring into question the adequacy of the system used to evaluate their quality and, especially, raise questions about the capacity of any system to define literacy in simple terms. If competence in spelling is the goal, then more direct measures can be taken.
Literacy is about communicating effectively. Surely it is important that a text does not simply have more words and look good on the surface, but that it is also responsive to complex demands of the task and the context (Beaugrande, 1984). To assess standards of literacy ability on the quality of surface features alone is akin to appreciating the value of a picture window which boasts a frosted pane.
No one will deny that these young writers who have been deemed to be well below standard need to focus their attention on spelling. But they should be commended for what they can do in responding directly to a task requirement and in controlling complex grammatical structures, not roundly condemned primarily on the basis of poor skills in spelling.
As learners everywhere will realise, too, there is another possible interpretation for the relative differences between these pairs of texts. While concentrating on new skills of grammatical embedding, the writers of the 'substandard' texts would typically demonstrate a temporarily poorer performance than usual in other areas such as spelling. These fluctuations occur until complementary skills are synchronised, through practice, to a point where they become virtually automatic. Anybody who has ever learned multifaceted skills, like driving a car, will appreciate the effect. It could be, then, that these students are being penalised, and criticised, for attempting to develop their literacy ability!
Issues surrounding literacy can be understood and addressed more effectively if a profile shows the extent to which the identified features are controlled by young people. The general public should have no difficulty in understanding that students may demonstrate different levels of ability across the key areas. Profiles of individuals, the whole group, and sub-groups could show how well students communicated:
Beaugrande, de R. (1984). Text production: Toward a science of composition. Norwood, New Jersey: Ablex.
Bourke, S. F. & Keeves, J. P. (1977). Australian studies in school performance. Volume III. The mastery of literacy and numeracy: Final report. (ERDC Report No. 13). Canberra: Australian Government Printing Service.
Bourke, S. F., Mills, J. M., Stanyon, J. & Holzer, F. (1981). Performance in literacy and numeracy: 1981. Canberra: Australian Government Printing Service.
Charney, D. (1984). The validity of using holistic scoring to evaluate writing: a critical overview. Research in the Teaching of English, 18(1), 65-81.
Collins, P. (1993). Aspects of later language development. Australian Review of Applied Linguistics, 16(2), 15-26.
Commonwealth of Australia (1997a). Literacy standards in Australia. Canberra: Commonwealth of Australia.
Commonwealth of Australia (1997b). Mapping literacy achievement: Results of the 1996 National School English Literacy Survey. Canberra: Commonwealth of Australia.
Curriculum Corporation (1994). English: A curriculum profile for Australian schools. Carlton, Victoria: Curriculum Corporation.
Diedrich, P. B. (1974). Measuring growth in English. Urbana, Illinois: National Council for the Teaching of English.
Duck, G. (1979). A comparison of the reading achievement of Queensland Year 5 pupils between 1971 and 1977. Brisbane: Research Branch, Department of Education, Queensland.
Fine, J. (1985). What do surface markers mean? Towards a triangulation of social, cognitive and linguistic factors. In J. D. Benson & S. Greaves (Eds), Systemic perspectives on discourse (Vol. 2) (pp. 102-115). Norwood, New Jersey: Ablex.
Gelb, I. J. (1963). A study of writing (2nd ed.). Toronto, Canada: Toronto Press.
Gipps, C. V. (1994). Beyond testing: Towards a theory of educational assessment. London: Falmer Press.
Gradwohl, J. M. & Schumacher, G. M. (1989). The relationship between content knowledge and topic choice in writing. Written Communication, 6(2), 181-195.
Halliday, M. A. K. (1985). An introduction to functional grammar. London: Edward Arnold.
Hartog, P. & Rhodes, E. C. (1936). An examination of examinations (2nd ed.). London: Macmillan & Co.
Hunt, K. (1965). A synopsis of clause-to-sentence-length factors. English Journal, 54, 305-309.
Jacobson, J. (1978). A summary of the research into reading standards of Queensland grade five pupils 1933-1977. Brisbane: Queensland Institute for Educational Research.
Lo Bianco, J. & Freebody, P. (1997). Australian literacies: Informing national policy on literacy education. Melbourne: Language Australia.
Peckman, G., Fifoot, C., & Byrne, M. (1988). A comparison of the reading achievement of Queensland Year 5 pupils between 1981 and 1986. Brisbane: Department of Education, Queensland.
Resnick, D. & Resnick, L. (1997). The nature of literacy: An historic exploration. Harvard Educational Review, 47, 263-291.
Review and Evaluation Directorate (1993). Assessment of student performance 1992. Aspects of reading and writing: Overall results. Queensland: Review and Evaluation Directorate, Department of Education.
Roach, J. (1971). Public examinations in England 1850-1900. Cambridge: Cambridge University Press.
Sadler, D. R. (1987). Specifying and promulgating achievement standards. Oxford Review of Education, 13(2), 191-209.
Wickert, R. (1989). No single measure: A survey of Australian adult literacy. Canberra: Commonwealth Department of Employment, Education and Training.
Wiseman, S. (1961). The efficiency of examinations. In S. Wiseman (Ed.), Examinations and English education (pp. 133-164). Manchester: Manchester University Press.
Wood, R. (1991). Assessment and testing: A survey of research commissioned by the University of Cambridge Local Examinations Syndicate. Cambridge & New York: Cambridge University Press.
Author details: Lenore Ferguson teaches English curriculum to preservice secondary teachers. In addition to being passionate about the creative aspects of English curriculum, she is interested in the teaching and learning of literacy at all age levels. The analytic approach used in this paper was developed as part of her current doctoral studies.
Please cite as: Ferguson, L. (1997). Assessing literacy: Beyond the frosted pane. Queensland Journal of Educational Research, 13(1), 71-90. http://education.curtin.edu.au/iier/qjer/qjer13/ferguson.html