A study on the reliability of the final achievement computer-Based mcqs test 1 for the 4th semester non - english majors at hanoi university of business and technology

Testing plays a very important role in the teaching and learning process. It is a form of measurement used to point out strengths and weaknesses in students’ learned abilities. Through testing, and test scores in particular, we may discover the performance of both students and teachers. For students, test scores reveal what they have achieved after a period of learning. For teachers, test scores indicate what they have taught their students. Based on test results, we may improve teaching, learning and testing for better instructional effectiveness. Another reason for choosing testing as the subject of this study is that current language testing at Hanoi University of Business and Technology (HUBT) has been a matter of considerable controversy among students and teachers. Testing is mainly carried out in the form of two objective computer-based tests (named test 1 and test 2), which are administered at the end of each semester. The scores that a student obtains on these tests are the main indicators of his or her performance during the whole semester. Opinions on the results of these tests differ, especially on test 1 for second-year non-English majors. Some subject teachers claim that these tests do not truly reflect the students’ language competence. Others say that the tests are appropriate to what students have learnt in class, compatible with the course objectives, and therefore reliable. Opposing views also exist among the students. Many think that the tests are more difficult than what they have learnt and studied for the exam; others say that the test items are easy and relevant to what they have been taught. It is therefore essential to find out whether the tests are closely related to what the students have learnt and what the teachers have taught, and whether the tests are reliable.


VIETNAM NATIONAL UNIVERSITY, HANOI
COLLEGE OF FOREIGN LANGUAGES
DEPARTMENT OF POSTGRADUATE STUDIES

NGUYEN THI VIET HA

A STUDY ON THE RELIABILITY OF THE FINAL ACHIEVEMENT COMPUTER-BASED MCQS TEST 1 FOR THE 4TH SEMESTER NON-ENGLISH MAJORS AT HANOI UNIVERSITY OF BUSINESS AND TECHNOLOGY

(Đánh giá độ tin cậy của bài thi trắc nghiệm thứ nhất trên máy tính cuối kỳ 4 dành cho sinh viên năm thứ hai không chuyên ngành tiếng Anh trường Đại học Kinh doanh và Công nghệ Hà Nội)

Minor Programme Thesis
Field: Methodology
Code: 601410
Supervisor: Nguyễn Thu Hiền, M.A.

HANOI, 2008

CANDIDATE’S STATEMENT

I hereby state that I, Nguyen Thi Viet Ha, Class 14A, being a candidate for the degree of Master of Arts (TEFL), accept the requirements of the College relating to the retention and use of Master of Arts theses deposited in the library. In terms of these conditions, I agree that the original of my thesis deposited in the library should be accessible for the purposes of study and research, in accordance with the normal conditions established by the librarian for the care, loan or reproduction of the thesis.

Signature:
Date:

ACKNOWLEDGMENTS

In completing this thesis, I have received a great deal of support. Of primary importance has been the role of my supervisor, Ms. Nguyen Thu Hien, M.A., teacher at the Department of English and American Languages & Cultures, College of Foreign Languages, Vietnam National University, Hanoi. I am deeply grateful to her for her precious guidance, enthusiastic encouragement and invaluable critical feedback. Without her dedicated support and correction, this thesis could not have been completed.

I am deeply indebted to my dear teacher, Mr. Vu Van Phuc, M.A., Head of the Testing Center, College of Foreign Languages, VNU, who provided me with many useful suggestions and much assistance with my study.

I would also like to express my sincere thanks to all the teachers and colleagues in the English Department, HUBT, for their help in conducting the survey, sharing opinions and making suggestions for the study. In particular, my thanks go to Ms. Le Thi Kieu Oanh, Assistant of the English Department, HUBT, for her willingness to provide the test score data.

I wish to offer my special thanks to the students of K11 at Hanoi University of Business and Technology who actively participated in the survey.

Finally, it is my great pleasure to acknowledge my gratitude to the beloved members of my family, especially my husband, who constantly encouraged and helped me with my thesis.

ABSTRACT

The main aim of this minor thesis is to evaluate the reliability of the final achievement computer-based MCQs test 1 for the 4th semester non-English majors at Hanoi University of Business and Technology. To achieve this aim, a combination of qualitative and quantitative research methods was adopted. The findings indicate that there is a certain degree of unreliability in the final achievement computer-based MCQs test 1 and that two main factors cause this unreliability: test item quality and test-takers’ performance.
Having carefully analyzed the collected data, the author makes some suggestions for improving the quality of the final achievement test and the MCQs test 1 for non-majors of English in the 4th semester at Hanoi University of Business and Technology. Firstly, the test objectives, sections and skill weight should be adjusted to be more compatible with the course objectives and the syllabus. Secondly, a testing committee should be set up to construct and develop a multiple-choice item bank containing test items with good p-values and discrimination values.

LIST OF ABBREVIATIONS

1. CBT: Computer-based testing
2. HUBT: Hanoi University of Business and Technology
3. MC: Multi choice
4. MCQs: Multi choice questions
5. ML Pre-: Market Leader Pre-intermediate
6. KD: Kuder-Richardson
7. SD: Standard deviation

LIST OF TABLES AND CHARTS

Table 1: Types of tests
Table 2: Scoring format for each semester
Table 3: The syllabus for the 4th semester (for non-English majors)
Table 4: Time allocation for language skills and sections
Table 5: Specification grid for the final computer-based MCQs test 1
Table 6: Main points in the grammar section
Table 7: Main points in the vocabulary section
Table 8: Topics in the reading section
Table 9: Items in the functional language sections
Table 10: Test reliability coefficient
Table 11: p-value of items in 4 sections
Table 12: Discrimination value of items in 4 sections
Table 13: Number of test items with acceptable p-value and discrimination value in 4 sections
Table 14: Suggested scoring format
Table 15: Proposed test specifications
Chart 1: Students’ response on test content
Chart 2: Students’ response on item discrimination value
Chart 3: Students’ response on time length
Chart 4: Students’ response on arbitrariness
Chart 5: Students’ response on relation between test score and their achievement

TABLE OF CONTENTS

CANDIDATE’S STATEMENT
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF ABBREVIATIONS
LIST OF TABLES AND CHARTS
TABLE OF CONTENTS
Chapter 1: INTRODUCTION
1.1. Rationale for the study
1.2. Aims and research questions
1.3. Theoretical and practical significance of the study
1.4. Scope of the study
1.5. Method of the study
1.6. Organization of the paper
Chapter 2: LITERATURE REVIEW
2.1. Language testing
2.1.1. What is a language test?
2.1.2. The purposes of language tests
2.1.3. Types of language tests
2.1.4. Criteria of a good language test
2.2. Achievement test
2.2.1. Definition
2.2.2. Types of achievement test
2.2.3. Considerations in final achievement test construction
2.3. MCQs test
2.3.1. Definition
2.3.2. Benefits of MCQs test
2.3.3. Limitations of MCQs test
2.3.4. Principles on designing a good MCQs test
2.4. Reliability of a test
2.4.1. Definition
2.4.2. Methods for test reliability estimate
2.4.3. Measures to improve test reliability
2.5. Summary
Chapter 3: THE CONTEXT OF THE STUDY
3.1. The current English learning, teaching and testing situation at HUBT
3.2. The course objectives, syllabus and materials used for the second-year non-majors of English in Semester 4
3.2.1. The course objectives
3.2.2. Business English syllabus
3.2.3. The course book
3.2.4. Specification grid for the final achievement computer-based MCQs test in Semester 4
Chapter 4: METHODOLOGY
4.1. Participants
4.2. Data collection instruments
4.3. Data collection procedure
4.4. Data analysis procedure
Chapter 5: RESULTS AND DISCUSSIONS
5.1. The compatibility of the objectives, content and skill weight format of the final achievement computer-based MCQs test 1 for the 4th semester with the course objectives and the syllabus
5.1.1. The test objectives and the course objectives
5.1.2. The test item content in four sections and the syllabus content
5.1.3. The skill weight format in the test and the syllabus
5.2. The reliability of the final achievement test
5.2.1. Reliability coefficient
5.2.2. Item difficulty and discrimination value
5.3. The attitude of students towards the MCQs test 1
5.4. Pedagogical implications and suggestions on improvements of the existing final achievement computer-based MCQs test 1 for the non-English majors at HUBT
5.5. Summary
Chapter 6: CONCLUSION
6.1. Summary of the findings
6.2. Limitations of the study
6.3. Suggestions for further study
REFERENCES
APPENDICES
APPENDIX 1: Grammar, Reading, Vocabulary and Functional language check list
APPENDIX 2: Survey questionnaire (for students at HUBT)
APPENDIX 3: Students’ test scores
APPENDIX 4: Item analysis of the final achievement computer-based MCQs test 1 (150 items, 349 examinees)
APPENDIX 5: Item indices of the final achievement computer-based MCQs test 1

Chapter 1: INTRODUCTION

1.1. Rationale for the study

Testing plays a very important role in the teaching and learning process. It is a form of measurement used to point out strengths and weaknesses in students’ learned abilities. Through testing, and test scores in particular, we may discover the performance of both students and teachers. For students, test scores reveal what they have achieved after a period of learning. For teachers, test scores indicate what they have taught their students. Based on test results, we may improve teaching, learning and testing for better instructional effectiveness.

Another reason for choosing testing as the subject of this study is that current language testing at Hanoi University of Business and Technology (HUBT) has been a matter of considerable controversy among students and teachers.
Testing is mainly carried out in the form of two objective computer-based tests (named test 1 and test 2), which are administered at the end of each semester. The scores that a student obtains on these tests are the main indicators of his or her performance during the whole semester. Opinions on the results of these tests differ, especially on test 1 for second-year non-English majors. Some subject teachers claim that these tests do not truly reflect the students’ language competence. Others say that the tests are appropriate to what students have learnt in class, compatible with the course objectives, and therefore reliable. Opposing views also exist among the students. Many think that the tests are more difficult than what they have learnt and studied for the exam; others say that the test items are easy and relevant to what they have been taught. It is therefore essential to find out whether the tests are closely related to what the students have learnt and what the teachers have taught, and whether the tests are reliable.

For the two reasons mentioned above, the author undertakes this study, entitled “A study on the reliability of the final achievement Computer-based MCQs Test 1 for the 4th semester non-English majors at Hanoi University of Business and Technology”, with the intention of examining the rumors about this test. In addition, the author hopes that the study results will help to raise awareness among teachers and others who are interested in this field. At the same time, the results can, to some extent, be applied to improve the current testing situation at HUBT.
1.2. Aims and research questions

The main aim of the study is to investigate the reliability of the existing final achievement MCQs test 1 (4th semester) for non-English majors at HUBT by analyzing the test objectives, test content and test skill weight format, the students’ scores, the test items, and the students’ perceptions of and comments on the test, and then to make suggestions for the test’s improvement. To achieve this aim, the following research questions are set for exploration:

1. Are the objectives, content and skill weight format of the final achievement computer-based MCQs test 1 compatible with the course objectives, the syllabus content and the skill weight format?
2. To what extent is test 1 reliable?
3. What are the students’ attitudes towards the final achievement computer-based MCQs test 1?

1.3. Theoretical and practical significance of the study

Theoretically, the study shows that testing is crucial for measuring and evaluating the quality of learning and teaching, and that test reliability is one of the most important criteria for evaluating a test. Practically, the study shows how reliable the final achievement MCQs test 1 administered at HUBT is and how its quality can be improved.

1.4. Scope of the study

The study is limited to the existing final achievement computer-based MCQs test 1 administered in the 4th semester to second-year non-English majors at HUBT.

1.5. Method of the study

Both qualitative and quantitative methods are used. The qualitative method is applied to the literature review on language testing, the course objectives, the syllabus, the objectives, content and format of achievement test 1 for the 4th term, and the results of the student questionnaires. The quantitative method is used for the analysis of test scores and test items.

1.6. Organization of the paper

The study is composed of six chapters.
Chapter 1 (Introduction) briefly states the rationale, aims and research questions, scope of the study, theoretical and practical significance of the study, method of the study and organization of the paper.
Chapter 2 (Literature review) discusses relevant theories of language testing, final achievement tests, computer-based MCQ tests and test reliability.
Chapter 3 (The context of the study) deals with the English learning, teaching and testing situation at HUBT, the course book, the syllabus and the check list for the test.
Chapter 4 (Methodology) presents the participants, the data collection instruments, and the data collection and data analysis procedures.
Chapter 5 (Results and discussions) presents and discusses the results of the study. Suggestions for improving achievement test 1 are also proposed in this chapter.
Chapter 6 (Conclusion) summarizes the findings, mentions the limitations and provides suggestions for further study.

Chapter 2: LITERATURE REVIEW

2.1. Language testing

2.1.1. What is a language test?

There is a wide variety of definitions of a language test, and they share one point: a language test is a device for measuring individuals’ language ability.

According to Henning (1987, p.1), “Testing, including all forms of language test, is one form of measurement”. In his opinion, tests such as listening or reading comprehension are given in order to find out the extent to which these abilities are present in the learners. Similarly, Bachman (1990, p.20) stated: “A test is a measurement instrument designed to elicit a specific sample of an individual’s behavior”. He also regarded the elicitation of a specific sample of behavior as what distinguishes a test from other types of measurement.

Brown H.D (1995, p.384) presented the notion in a simpler way: “A test, in plain words, is a method of measuring a person’s ability or knowledge in a given domain”.
He explained that a test is first and foremost a method, which includes items and techniques requiring a performance from the testees. Through this performance, a person’s ability or language competence is measured.

These viewpoints show that a language test is an effective tool for measuring and assessing students’ language knowledge and skills and for providing valuable information for better future teaching and learning.

2.1.2. The purposes of language tests

Different scholars view the purposes of language tests from different perspectives. Typically, Henton (1990) mentioned seven purposes:

- Finding out about progress
- Encouraging students
- Finding out about learning difficulties
- Finding out about achievement
- Placing students
- Selecting students
- Finding out about proficiency

In general, a language test is used to evaluate both teachers’ and students’ performance, to make judgments about and adjustments to teaching materials and methods, and to strengthen students’ motivation for further study.

2.1.3. Types of language tests

Language tests can be classified into different types according to their purposes. Henton (1990), Brown (1995), Harrison (1983) and Hughes (1989) pointed out that language tests include four main types: proficiency tests, diagnostic tests, placement tests and achievement tests, with the characteristics shown in the following table:

Type of test | Characteristics
Proficiency test | Measures people’s abilities in a language regardless of any training they may have had in that language
Diagnostic test | Checks students’ progress for their strengths and weaknesses and what further teaching is necessary
Achievement test | Assesses what students have learnt from a known syllabus
Placement test | Classifies students into groups at different levels at the beginning of a course

Table 1: Types of tests

Another researcher, Henning (1987), divided tests into objective and subjective ones on the basis of the manner in which they are scored.
Subjective tests are scored through the opinion and judgment of the scorer, while objective tests are scored by comparing examinee responses with an established set of acceptable responses, or scoring key.

2.1.4. Criteria of a good language test

Just like any measuring device, a language test is subject to potential measurement error. For the purpose of investigating, evaluating and “testing” a test, researchers such as Brown (1995), Henning (1987), Bachman (1990) and Harrison (1983) identified criteria for determining whether a test is good or not. A good language test must have four essential qualities: reliability, validity, practicality and discrimination.

The reliability of a test is its consistency (Brown, 1995; Harrison, 1983). A test is reliable only when it yields the same results regardless of the circumstances under which it is administered or the marker who scores it.

The validity of a test refers to “the degree to which the test actually measures what it is intended to measure” (Brown, 1995, p.387). A test is considered to be valid if it possesses content validity, face validity and construct validity.

The practicality of a test concerns its administration. A test is practical when it saves time and money and is easy to administer, mark and interpret.

The discrimination of a test is the extent to which a test separates the students from each other (Harrison, 1983). In other words, it is the capacity of the test to discriminate among different students and to reflect the performance of individuals within the same group.

2.2. Achievement test

2.2.1. Definition

Achievement tests are used extensively at different levels of education because of their distinctive characteristics. Researchers define the notion of achievement tests in various ways. Henning (1987, p.6) held that:

Achievement tests are used to measure the extent of learning in a prescribed content domain, often in accordance with explicitly stated objectives of a learning program.
From this definition, it follows that an achievement test is a measurement tool designed to examine learners’ language competence over a period of instruction and to evaluate the instructional program. In the same vein, Hughes (1989) stated that achievement tests are intended to assess how successful individual students, groups of students or the courses themselves have been in achieving objectives.

Achievement tests play an important role in education programs, especially in evaluating the language knowledge and skills that students have acquired during a given course.

2.2.2. Types of achievement test

Achievement tests can be subdivided into final achievement tests and progress achievement tests according to the time of administration and the desired objectives (Henton, 1990).

Final achievement tests are usually given at the end of the scho