DEVELOPMENT OF THREE-TIER DIAGNOSTICS INSTRUMENTS ON STUDENTS MISCONCEPTION TEST IN FLUID CONCEPT

The purpose of this research was to determine the validity and reliability of a three-tier diagnostic instrument for identifying students' misconceptions in the fluid concept. The development research followed three stages: (1) preliminary research and information collecting, (2) planning and designing the development, and (3) validation and product evaluation. The test instrument was tried out on 98 students from 3 schools. The developed instrument was categorized as effective because it was valid, with an Aiken validity index above the table value of V (0.75), and its reliability was 0.96, in the high category. The developed three-tier test instrument was able to identify students' comprehension of the concept and students' misconceptions: 27.58% of students comprehended the concept, 45.29% did not comprehend the concept, 24.74% showed misconceptions, and 2.36% experienced errors. © 2018 Physics Education, UIN Raden Intan, Lampung, Indonesia.


INTRODUCTION
One of the main problems in physics education is students' misconceptions (Wijaya, Supriyono Koes, & Muhardjito, 2016). When students learn about the world around them, whether through formal school education or nonformal education in daily experience, they tend to develop their comprehension based on their own views. Because of this concern, several researchers have conducted studies to describe students' comprehension. The varieties of comprehension formed by students are known by several terms, such as "alternative conceptions", "misconceptions", "naive beliefs", "children's ideas", "conceptual difficulties", "phenomenological primitives", "mental models", and many others (Gurel, Eryilmaz, & McDermott, 2015). Lack of comprehension of a concept can be remedied by further instruction and learning, while a misconception is believed to inhibit the acceptance and development of students' knowledge and abilities (Hasan, Bagayoko, & Kelley, 1999). Misconceptions must be overcome because they can have a negative effect on the further learning process (Djanette & Fouad, 2014; Lucariello, Tine, & Ganley, 2014; Sholihat, Samsudin, & Nugraha, 2017). For this reason, it is necessary to identify whether students have understood the concept well, have not understood the concept well, or hold misconceptions.
A misconception is a condition in which students' comprehension differs from that of experts (Resbiantoro & Nugraha, 2017; Wijaya et al., 2016). A misconception can also be defined as a concept that is contrary to scientifically and generally accepted theories (Gurel et al., 2015). Misconceptions can be identified by several methods, one of which is a diagnostic test instrument such as the three-tier instrument (Zukhruf, Khaldun, & Ilyas, 2016). Three-tier tests are considered to have the advantage of distinguishing students who lack comprehension from students who hold misconceptions (Gurel et al., 2015).
A three-tier test instrument can identify students' comprehension of a concept without requiring much time from students (Wijaya et al., 2016; Wisudawati, 2015). In addition, the three-tier test instrument can distinguish students who comprehend the concept, hold misconceptions, guess (lack confidence), or do not know the concept (Abbas, 2016). Besides identifying students' misconceptions, the three-tier test instrument is also able to identify students who comprehend the concept and students who comprehend it less well (Aini, Ibnu, & Budiasih, 2016). In the three-tier instruments commonly used, the second tier still offers alternative answers, so test participants sometimes simply guess, and it is difficult to know the cause of the difficulties experienced by the students (Suwarto, 2013). In this research, the researcher develops a three-tier test instrument that is able to identify students' difficulties in learning by using open-ended reasons in the second tier. A test instrument needs to be prepared carefully, following the rules set by educational measurement and the curriculum (Prihatni, Kumaidi, & Mundilarto, 2016; Siregar, 2014). An instrument used for collecting data must be validated, both to obtain suggestions and improvements and to assess the items in the instrument (Suryani, Kartowagiran, & Jailani, 2017). Validity is the degree to which evidence and theory support the interpretation of test scores for the intended use of the test (Pruyn, Watsford, & Murphy, 2016).

Diagnostic Formative Assessment
Assessment can be described as the act of collecting data in order to make decisions based on the information obtained in the testing phase. Assessment in learning has the following characteristics: (1) assessment starts by collecting information about students in the learning process; (2) the collected data and information are analyzed and interpreted; (3) the interpretation results in decisions about learning; (4) the decisions are followed up; (5) assessment is carried out on an ongoing basis (Kusairi, 2012). Key points about formative assessment include: (1) formative assessment is a process carried out during learning; (2) formative assessment results are used not only by teachers but also by students; (3) formative assessment provides feedback on student learning and on the learning process carried out by the teacher; (4) the feedback provided by formative assessment is useful for students and teachers in making adjustments so that teaching and learning can achieve curriculum goals (Hermawanto, Kusairi, & Wartono, 2013). Formative assessment in the form of objective tests can help students with low learning outcomes to learn the subject matter as a whole. A formative diagnostic assessment is an instrument prepared to identify or diagnose learning outcomes, so that its use produces feedback on those outcomes.
The tests of validity, reliability, difficulty level, distinguishing power, and deception index of the instrument for identifying student misconceptions involved one class from each school, with a total of 98 students. The testing phase is intended to obtain information about the feasibility, adaptability, and functioning of the test instrument under field conditions. In addition, the trials provide information about the validity, reliability, distinguishing power, difficulty level, and deception index of the instrument. The validity of an instrument does not apply generally to all measurement objectives. A test generally produces a valid measure only for a particular measurement goal.

METHOD
The Research Procedure
The content validity of a test instrument can be examined by checking the suitability of the indicators in the test grid against the operational definition of the test instrument and of the construct it measures. Content validity can be determined using the Aiken equation.
In addition to validity, the reliability of an instrument is also needed in its preparation, because reliability shows the consistency of an instrument (Mohamad et al., 2015; Yasin et al., 2015). Reliability is the constancy of test results: if a change occurs, the change is not meaningful or significant. Reliability can also be described as an instrument's freedom from error and its ability to produce consistent results. The reliability of an instrument can be obtained by trying out the instrument on test participants; the tryout also yields information on item difficulty, item distinguishing power, and the deception index of the items (Yamtinah, Saputro, & Utami, 2015). A test instrument is said to be reliable if its items measure a person consistently, so that measurements of the same subject give the same results even at different times. There is no general agreement on the minimum coefficient an instrument needs to be called reliable, but a test used to make decisions about students must have a reliability coefficient of at least 0.70 (Fitriatun & Sukanti, 2016). The reliability of the assessment instrument developed in this study was determined using the Cronbach Alpha formula (Fitriatun & Sukanti, 2016).
Cronbach Alpha is generally used for tests with standard multiple-choice items or essay items. In principle, Cronbach Alpha measures homogeneity, focusing on two important aspects: the content aspect and the heterogeneity aspect of the test. If the test items are heterogeneous, meaning that they measure more than one characteristic, trait, or attribute, the alpha coefficient will be lower. Conversely, if the test is more homogeneous, the alpha coefficient will be higher, which means the test is more consistent (Shirali, Shekari, & Angali, 2017).
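As a rough illustration of how the Cronbach Alpha coefficient discussed above behaves, the sketch below assumes the standard form of the coefficient, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the small score matrix are hypothetical and are not data from this study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach Alpha for a score matrix (hypothetical helper).

    scores: list of rows, one row per student, each row holding that
    student's score on every item (for example 0 or 1).
    """
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("at least two items are needed")
    # Variance of each item (column) across students.
    item_variances = [pvariance(column) for column in zip(*scores)]
    # Variance of the students' total scores.
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Illustrative data only: 4 students, 3 dichotomous items.
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 2))
```

The same variance estimator is used for the items and for the totals, so the ratio inside the formula does not depend on whether population or sample variance is chosen.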
Reliability is the stability of the scores obtained by the same person when retested with the same test in different situations, or from one measurement to another, so reliability can be described as the level of consistency between two measurements of the same thing. With a reliable test, a person will obtain a constant score when given the same test at different times. A good test instrument must also have items with a good difficulty level.
The difficulty level of a question indicates the quality of a test item. A test item is considered good if it is neither too difficult nor too easy; in other words, its degree of difficulty is moderate. Questions that are too easy do not stimulate students to increase their effort in solving them, while questions that are too difficult cause students to despair and lose the enthusiasm to try again. The level of difficulty is determined as the proportion of correct answers (Fitriatun & Sukanti, 2016):
$P = \frac{B}{JS}$ (Arikunto, 2013)
where B is the number of students who answered the item correctly and JS is the total number of students who took the test.
Distinguishing power is one of the points that must be considered in the preparation of instruments or questions for the analysis of learning outcomes. The distinguishing power of the items was analyzed to determine how well each item discriminates, as shown by its discrimination index value.
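The difficulty level described above, taken as the proportion of correct answers, can be sketched as follows. The function names and the cut-off values 0.30 and 0.70 are assumptions for illustration (the category table of this paper is not reproduced in the text), so they should be replaced by the categories actually used.

```python
def difficulty_index(item_responses):
    """Proportion of test takers who answered one item correctly.

    item_responses: list of 1 (correct) or 0 (incorrect), one per student.
    """
    return sum(item_responses) / len(item_responses)


def difficulty_category(p, difficult_below=0.30, easy_above=0.70):
    """Map a difficulty index to a category.

    The thresholds are assumed values, not taken from this paper.
    """
    if p > easy_above:
        return "easy"
    if p < difficult_below:
        return "difficult"
    return "medium"


# Illustrative data only: 10 students, 6 of them answered correctly.
responses = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]
p = difficulty_index(responses)
print(p, difficulty_category(p))
```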
The distinguishing power of a question is its ability to distinguish between high-ability students and low-ability students. It indicates whether a question can separate groups on the aspect being measured in line with the actual differences between those groups. In principle, the distinguishing index is calculated by dividing the test takers into two groups: the upper group of high-ability test takers and the lower group of low-ability test takers. Distinguishing power can be analyzed using the equation given by Arikunto (2013).
The answer pattern of a question is found by counting the number of test takers who chose each option provided (Fitriatun & Sukanti, 2016). A distractor is said to function well if it is attractive enough that test participants hesitate and are ultimately fooled into choosing it as the correct answer. In the evaluation of learning outcomes, a distractor is generally considered to have carried out its function well if it has been chosen by at least 5% of all test participants (Fitriatun & Sukanti, 2016). Distractor efficiency (DE) is calculated from the number of non-functioning distractors (NFD) in an item and ranges from 0 to 100%. An NFD is an option other than the key that is rarely chosen by respondents (<5%) and therefore does not carry out its function (D'Sa & Visbal-Dionaldo, 2017; Mahjabeen et al., 2018).
The instrument used in this research was a conceptual diagnostic test consisting of three levels, or a three-tier test. The three-tier diagnostic test instrument consisted of 28 items covering 11 fluid concepts.
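The upper and lower group comparison described above for distinguishing power can be sketched as below. The sketch assumes the common upper-lower form of the index, D = (proportion correct in the upper group) - (proportion correct in the lower group), and assumes the groups are formed by splitting the test takers in half by total score. Both the function name and the split fraction are illustrative choices, not details confirmed by the text.

```python
def discrimination_index(total_scores, item_correct, fraction=0.5):
    """Upper-lower discrimination index for one item (hypothetical helper).

    total_scores: total test score of each student.
    item_correct: 1 or 0 for each student on the item being analyzed.
    fraction: share of students placed in each group (assumed 0.5).
    """
    n = len(total_scores)
    size = max(1, int(n * fraction))
    # Rank students from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:size], order[-size:]
    p_upper = sum(item_correct[i] for i in upper) / size
    p_lower = sum(item_correct[i] for i in lower) / size
    return p_upper - p_lower


# Illustrative data only: 6 students.
totals = [10, 9, 8, 3, 2, 1]
item = [1, 1, 1, 0, 0, 0]
print(discrimination_index(totals, item))
```

An index near 1 means that mostly the upper group answered the item correctly, while an index near 0 means the item does not separate the two groups.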

Research to analyze students' misconceptions about fluid concepts was carried out in several stages: initial data collection, data processing, and description and discussion of the findings. Data analysis used the analysis scheme compiled by Kaltakci (Kamilah & Suwarna, 2016).

RESULTS AND DISCUSSION
Validity
A draft instrument must be validated before it is used. The purpose of validation is to get feedback, criticism, and suggestions for improvement in accordance with the expertise of each validator. Expert validation aims to provide an assessment of the instrument that has been prepared. The validators may assess conformity with the prepared indicators, the suitability of the material, the suitability of the answer choices, the suitability of the language, or the suitability of the instrument as a measuring instrument.
The prepared instrument was validated by 4 experts and 4 teachers, and the data were analyzed using the Aiken equation. The results of the validity analysis were as follows:
Valid: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
Based on the results of the validity analysis, all 28 questions were in the valid category, with an Aiken validity index higher than the V table value for 8 validators, which is 0.75. A valid item is feasible to use.
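A minimal sketch of the Aiken index used above, assuming its usual form V = sum(r - lo) / (n(c - 1)), where r is a validator's rating, lo is the lowest possible rating, c is the number of rating categories, and n is the number of validators. The 1 to 5 rating scale and the example ratings are assumptions for illustration; only the 8 validators and the table value of 0.75 come from the text.

```python
def aiken_v(ratings, lowest=1, categories=5):
    """Aiken's V for one item (hypothetical helper).

    ratings: one rating per validator.
    lowest: lowest possible rating (assumed 1).
    categories: number of rating categories (assumed 5).
    """
    n = len(ratings)
    s = sum(r - lowest for r in ratings)
    return s / (n * (categories - 1))


V_TABLE = 0.75  # table value for 8 validators, as stated in the text

# Illustrative ratings only: 8 validators rating one item.
example_ratings = [5, 5, 4, 4, 5, 4, 5, 4]
v = aiken_v(example_ratings)
print(v, "valid" if v > V_TABLE else "not valid")
```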

Reliability
Reliability analysis was carried out on the 28 questions that had good validity. Based on the reliability analysis using the Cronbach Alpha formula with the help of the Quest program, the reliability of the questions was 0.96, in the high category.

Difficulty level
Difficulty level analysis was carried out on the 28 questions that had been tested for validity. The results of the analysis were as follows:
Easy: 2, 3, 5
Medium: 1, 4, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 19, 21, 22, 23, 27, 28
Difficult: 9, 14, 24, 25, 26
Based on these results, the 28 test questions fall into three difficulty categories: easy, medium, and difficult. The easy questions are numbers 2, 3, and 5 (10.71%); the medium questions are numbers 1, 4, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18, 19, 21, 22, 23, 27, and 28 (71.42%); and the difficult questions are numbers 9, 14, 24, 25, and 26 (17.85%).

Differentiating Power
Distinguishing power analysis was conducted to determine the quality of the items in distinguishing between the upper group who answered correctly and the lower group who answered correctly. The analysis was carried out on the 28 items that had good validity. The results of the analysis were as follows:
Bad: 1, 3, 5, 7, 14, 24, 25
Sufficient: 2, 4, 6, 10, 11, 13, 15, 18, 20, 21, 27, 28
Good: 8, 9, 12, 16, 17, 19, 22, 23, 26
Based on these results, the distinguishing power of the 28 questions compiled and tested on students falls into 3 categories: bad, sufficient, and good. The items in the bad category are numbers 1, 3, 5, 7, 14, 24, and 25 (25%); the items in the sufficient category are numbers 2, 4, 6, 10, 11, 13, 15, 18, 20, 21, 27, and 28 (42.85%); and the items in the good category are numbers 8, 9, 12, 16, 17, 19, 22, 23, and 26 (32.14%).

Deception Index
The deception index analysis was carried out to find out how well the incorrect alternative answers (distractors) of the multiple-choice test items functioned. Based on the analysis of the 28 test items, each with 5 alternative answers, alternative answers b, c, and d on item number 2, b, c, and d on item number 5, and a, c, and d on item number 8 did not function well as distractors, because they were selected by fewer than 5% of all test takers.
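The 5% rule applied above can be sketched as follows. The option labels a to e match the 5 alternative answers per item mentioned in the text; the function name, the example answer key, and the way distractor efficiency is expressed (share of distractors that function, in percent) are assumptions for illustration.

```python
from collections import Counter


def distractor_report(choices, key, options="abcde", threshold=0.05):
    """Find non-functioning distractors for one item (hypothetical helper).

    choices: the option chosen by each test taker.
    key: the correct option.
    Returns (non-functioning distractors, distractor efficiency in percent).
    """
    counts = Counter(choices)
    n = len(choices)
    distractors = [o for o in options if o != key]
    # A distractor chosen by fewer than 5% of test takers does not function.
    nonfunctioning = [o for o in distractors if counts[o] / n < threshold]
    efficiency = 100 * (len(distractors) - len(nonfunctioning)) / len(distractors)
    return nonfunctioning, efficiency


# Illustrative data only: 100 test takers, assumed key "a".
example_choices = ["a"] * 90 + ["e"] * 10
print(distractor_report(example_choices, key="a"))
```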

Conception of Students
The results of the student misconception analysis of the fluid concept using the three-tier test instrument are as follows. Table 14 shows that, of the 98 students in grade XI IPA of SMA (senior high school), 27.28% understood the fluid concept, 45.29% had not understood the fluid concept, 24.74% had misconceptions, and 2.36% made errors. The data analysis shows that many students hold misconceptions. The difference between students who did not understand and students who held misconceptions lies in their confidence in the answers they gave. Students who were confident in an answer that was wrong were categorized as having a misconception, whereas students who were not confident in their answer were categorized as not understanding the concept.
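The decision rule described above can be sketched as a small classifier. The text states only that a confident wrong answer counts as a misconception and that an unconfident answer counts as not understanding; the remaining branches, in particular treating a wrong answer with a correct reason and high confidence as an "error", are assumptions modeled on commonly used three-tier decision tables and may differ from the scheme actually applied in this study.

```python
from collections import Counter


def classify(answer_correct, reason_correct, confident):
    """Classify one three-tier response (hypothetical decision table)."""
    if not confident:
        # Stated in the text: no confidence means the concept is not understood.
        return "does not understand"
    if answer_correct and reason_correct:
        return "understands"
    if not answer_correct and reason_correct:
        # Assumed branch: wrong answer, correct reason, confident.
        return "error"
    # Confident with a wrong reason (answer right or wrong).
    return "misconception"


def category_percentages(responses):
    """Percentage of responses in each category."""
    counts = Counter(classify(*r) for r in responses)
    total = len(responses)
    return {category: 100 * count / total for category, count in counts.items()}


# Illustrative responses only: (answer_correct, reason_correct, confident).
example = [
    (True, True, True),
    (False, False, True),
    (False, False, False),
    (False, True, True),
]
print(category_percentages(example))
```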
Misconceptions in students are not limited to wrong answers. Based on the analysis of students' concept understanding using the three-tier test instrument with open-ended reasons, in which students are not given alternative answers, students found it difficult to give reasons for the answers they gave to the first-tier question. This shows students' difficulty in communicating what they understand.

Figure 1. The Research Procedure

Table 1. Interpretation of reliability values refers to Guilford's opinion

Table 2. Difficulty level categories

Table 6. Validity Analyzing Items

Table 7. Reliability analysis of test items

Table 8. Analysis of the difficulty level of test items

Table 9. Percentage of the difficulty level of test items

Table 10. Distinguishing power Analyzing

Table 11. Percentage of Distinguishing power

Table 12. Analyzing of deception alternative answer

Table 14. Percentage of students' concept comprehension level about the fluid concept