Behind PTE Academic: John de Jong talks tests, tasks, scoring, conspiracies, speaking and fairness with Jay
At E2Language, we’re intent on finding out the truth about English language tests. We don’t like to speculate about how best to prepare for and pass these tests because we understand that they are massive obstacles for you. Instead, we want to find out the facts. We believe that with accurate information and good teaching, taking a test can become a thrilling and validating experience.
Recently Jay had the chance to talk to Professor John H. A. L. De Jong*, the architect behind Pearson’s PTE Academic. John de Jong worked at Pearson from 2006 to 2016 and held the position of Vice President for Test Development and Senior Vice President for Global Standards. During this time he developed the Pearson Test of English Academic (PTE Academic). He is certainly an authority on language testing and perhaps the authority on PTE Academic.
In this conversation, Jay chats with John de Jong about why Pearson created the PTE Academic in the first place, why there are 20 different tasks, how the scoring interrelationships work, whether the system can be ‘gamed’ or not, how the speaking scoring works and why PTE Academic is a ‘fair’ English language test:
Jay: You were one of the principal architects of the PTE-A. Why did the marketplace need another English test with the IELTS and TOEFL already well established?
John: In fact, I was the principal architect of PTE Academic. Both TOEFL and IELTS are somewhat older tests using yesterday’s technology and yesterday’s concepts of language testing. Pearson contracted me to build a new test of English, and I wanted to create a real-language test. Traditional item writers construct some text around a language problem. The text is unlikely to occur in real life because it is written to include a language problem. For PTE Academic, item writers are required to base their questions on samples of language as they exist. Texts are not written for the test; they are found. Item writers are required to provide the source of their language samples. With the enormous amount of language in books, recordings, and on the Internet, item writers can find samples of real language like those test takers would encounter in their studies: texts about academic subjects, but also about student life. For example, traditional language tests make use of scripted language for listening items, which is then read by actors, producing very unnatural language, without the contractions, elisions, etc. that occur in live language use. Regular language users speak in broken sentences, repeating and circumscribing the essential elements.
Jay: I really like teaching the PTE-A because each task focuses on different English language skills. For example – and correct me if I’m wrong – but Reading: Fill in the Blanks has a focus on ‘collocations’ while Reading and Writing: Fill in the Blanks focuses on ‘word choice’. Can you discuss why you decided to create 20 tasks?
John: I chose to use a large number of different language tasks in order to best represent language as it occurs in real life and to allow the test to address language ability from a large number of angles, thereby ensuring that the multifaceted character of language ability is better represented. The various item types present a range of language aspects. Items can address integrated skills or concentrate on a single skill. This approach also offers students a variety of chances to show their ability to cope with English. They are not dependent on the ability to solve a particular type of language issue.
Jay: Some of the PTE-A tasks are ‘interrelated’. Speaking: Read Aloud, for example, contributes to both your speaking and reading scores while Listening: Summarize Spoken Text contributes points to both your listening and writing scores. Why did you decide to merge these skills?
John: I chose to present so-called integrated items, items addressing more than one skill, to reflect real-life language use. In dealing with the language, users often depend on more than one skill. For example, when asking where the nearest pharmacy is in a foreign city, the language user must be able to formulate the question and to understand the response. In listening to a lecture, the student must be able to follow what is being said and to take notes, i.e., to write.
Jay: Unfortunately, many teachers on the internet focus on ‘tricks’ and ‘gaming the algorithm’ rather than improving their students’ English language skills before taking the PTE-A. Is it possible for someone with low levels of English to ‘trick’ the computer into getting a high score?
John: When, in the initial phase, I presented some prototypical items to groups of students, they said that to do well on these items they could not depend on traditional test training but needed to really know English, and that therefore the best test preparation would be to read newspapers and books and to watch television. Indeed, I am confident that a high score on PTE Academic is more readily achieved by training language use than by seeking to trick the system.
Jay: A lot of our students have particular struggles with their speaking scores. What would be your advice for those candidates trying to improve their speaking scores?
John: My advice would be for them to record any sample of real-life language use (the news on TV, a song by a rapper, a radio discussion program, etc.) and to play it back sentence by sentence. First, they should make sure they completely understand it and look up any unknown words; then, in a second or third playback, they should repeat each sentence verbatim, trying to emulate as closely as possible the speed and the intonation of the original. Sometimes they should do this alone, and sometimes with a friend or colleague who can point out differences between the original and their own production. This approach will not only ‘open’ their ears to the English language but also help them to increase their vocabulary.
Jay: Many students have commented that they feel the PTE-A is ‘fairer’ than other English language tests because it is marked by a computer rather than a person. How good is the PTE-A at marking essays and speaking submissions?
John: First, an obvious difference between human and machine marking is that humans differ, and the machine doesn’t. The machine will award the exact same mark to an essay or a spoken response whether it is the first, second, or third time it is set to marking that response. But different people have different styles and different norms, and even a single person may mark differently at the start of a fresh day than by the evening when they are tired. Secondly, evidence of the quality of machine marking comes from the fact that machine marking has been shown to correlate better with strictly trained and experienced human markers than with novice markers.
*Professor John H. A. L. De Jong has a Master of Arts in General Linguistics, French, and English Languages from Leiden University and a Ph.D. in Educational Measurement from Twente University. He specialized in the empirical scaling of language proficiency and promotes the development of internationally standardized reporting scales of language proficiency.
In 2000, John de Jong founded his consultancy company LTS (Language Testing Services) to provide training and consultancy services on language education and language testing. He has worked with Ordinate Corporation in California, USA, the Australian Council for Educational Research, the Council of Europe, the Dutch Ministry of Education, Culture and Science, the Dutch Ministry of Justice, the World Bank, and the OECD, for which he has also been one of the technical advisers for the Programme for International Student Assessment (PISA) since 1998.
John de Jong has also been involved in the Council of Europe’s projects since 1991 working on defining a common framework for language learning, teaching, and testing. He also worked on establishing the learning load of a foreign language as a function of the first language.