Archive for the ‘Reliability of Psychometric Tests’ Category
Wednesday, July 28th, 2010
In this session we will explore the following:
1. The relationship between reliability and validity in psychometric assessment
2. How psychometric test administrators can impact the reliability of tests
Psychometric Test Reliability
When choosing a reputable test, whether aptitude or personality, one of the properties you will need to look for is reliability. We’ll consider reliability in appropriate detail in a later section of the course. For now, think of reliability as consistency. To have confidence in our test scores we need them to be consistent. In the real world we can’t test and retest our candidates to check this, but reputable test publishers will already have done so for you, under optimal conditions. So, once you know that you are using a reliable test (one that produces consistent scores), it’s your task as the test administrator to ensure that it remains reliable.
Why is reliability so important?
Whenever you assess something, you expect the score you get to be reliable. For example, if you assess your weight using bathroom scales, you expect the reading you get to be consistent across at least the short term. If you weigh yourself over 2 consecutive days and get significantly different readings you know something is wrong with the scales! The same is true of psychometric tests. The publisher first ensures that the test scores will be consistent over time and then you, as the administrator, need to ensure that your actions do not make the test less reliable.
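To make “consistent over time” concrete, here is a minimal Python sketch of a test-retest check: the same candidates sit the same test twice and the two sets of scores are correlated. The scores and the names `pearson_r`, `first_sitting` and `second_sitting` are made up for illustration and are not taken from any publisher's manual; a real reliability study would use a far larger sample.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical raw scores for five candidates tested on two occasions
first_sitting = [12, 15, 9, 20, 17]
second_sitting = [13, 14, 10, 19, 18]

test_retest = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {test_retest:.2f}")
```

A coefficient close to 1 means candidates keep roughly the same rank order across the two sittings, which is what we mean by a reliable test.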
Not only do we want and expect test results to remain reliable over time, but we also know that reliability is a precursor to validity. It sets an upper limit on the test’s validity. In other words, if your test is not reliable then it is not valid. Confusing? Let’s use the weighing scales example again…
Let’s suppose a medical doctor does some research which shows that those who weigh more than 120kg are significantly more likely to suffer a heart attack. His research shows that weight is a valid indicator for predicting a heart attack. The scales are fit for the purpose of predicting a heart attack. Validity is all about being fit for purpose. Now if those scales are not reliable, they will provide inconsistent data over the time of the research program. In this case would you have confidence in the doctor’s findings? Of course not!
So, to apply this to psychometric tests let’s take an aptitude test. We’ve carried out research which confirms that a new numerical reasoning test can predict the performance of accountants. Those who score better on the test are rated as better accountants. This is validity. The test is fit for the purpose of predicting accountant performance. You will hopefully have full confidence in this finding if you know the test is reliable. If, however, you suspect the test is producing inconsistent scores for your candidates, it is unreliable and, as in the scales example above, you will not have confidence in the test’s prediction of accountant performance. This is why reliability is a precursor to validity.
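The idea that reliability caps validity can be put into numbers. In classical test theory, the correlation between a test and a criterion cannot exceed the square root of the product of their reliabilities. This is a standard textbook result rather than something derived in this session, and the function name `max_validity` and the reliability values below are illustrative assumptions.

```python
from math import sqrt


def max_validity(test_reliability, criterion_reliability=1.0):
    """Upper bound on the test-criterion correlation (classical test theory).

    Both arguments are reliability coefficients between 0 and 1. The default
    assumes, optimistically, that the criterion (e.g. performance ratings)
    is measured without error.
    """
    for r in (test_reliability, criterion_reliability):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliability must be between 0 and 1")
    return sqrt(test_reliability * criterion_reliability)


# Hypothetical reliabilities: the ceiling drops as the test gets less consistent
for reliability in (0.9, 0.7, 0.5):
    ceiling = max_validity(reliability)
    print(f"reliability {reliability:.1f} -> validity cannot exceed {ceiling:.2f}")
```

Note that this is only a ceiling. A highly reliable test can still have poor validity if it consistently measures something irrelevant to the job.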
And why is all of this so important for this course? It’s because you as the test administrator can enhance or reduce the reliability of the test by how you administer it in the first place. Let’s now take a look at what factors you can and can’t influence in terms of reliability.
How psychometric test administrators can impact the reliability of tests

[Figure: Factors Affecting Psychometric Test Reliability. (C) 2010 PsyAsia International: No Copying]
Take a look at the graphic on the left. It shows different factors which can impact the reliability of psychometric tests. This applies to both aptitude tests and personality assessments.
Factors within the test
Generally, a test administrator is not responsible for this. The test publisher must design tests that will be highly reliable. Factors within the test means that the questions chosen must be accessible to all groups for whom the test is intended. If a subgroup finds some questions difficult because of their group membership (e.g. non-native English speakers may not understand a colloquialism used in a test question), then the test will be less reliable for that group. Although the publisher needs to ensure a reliable test, not all test publishers are reputable or know what they are doing! This is why the person who purchases the test needs to know how to evaluate it. We’ll show you later how to evaluate a test in greater detail. Know for now that you do not evaluate or validate a test by trialling it on yourself or a colleague, as many untrained users think!
Factors within the respondent
Whilst the test administrator cannot control all the possible factors within a respondent, you can do your best to ensure you control for as many as possible. It’s a good idea to think here about how you would like to be treated if you were undergoing a psychometric assessment for the first time. You’d probably like a friendly invitation letter explaining what is going to happen and why. You’d like to know that your data and results will remain confidential and only shared with decision-makers and only for the purpose that you’re undertaking the test. You’d also like to know what you need to bring with you and, if possible, a few example questions as approved by the test publisher might help to set your mind at rest. Finally it would be good to have a number to call should you have any special needs that you wish to convey to the administrators before the day. So, when you arrive at the test centre you already know what is going to happen and why, you won’t be overly concerned, you’ll have all the right things with you (e.g., reading glasses) and you’ll know how long the session is going to last. If it’s a personality test you’ll be more likely to be open and honest because you know your results won’t go further than the selection or development committee and won’t be used for reasons beyond the reason you’ve already been given.
Ultimately here you are attempting to control for mood and expectations. Ideally you don’t want these to vary between candidates in order to give everybody the same start line. On the actual day of the test you will go over all of these things again with the candidates in the room to ensure that they are all clear on what will happen and why. Again, this sets the scene and mood, demonstrates your organisation’s “humanness” in the assessment process and provides candidates with an opportunity to ask questions. Furthermore, on the day you will need to ensure that you administer the test instructions word for word and then administer the test exactly as intended by the test publisher. Doing all of this enhances consistency and thus increases reliability. This is essential as we saw before because reliability is the precursor to validity.
Factors within the environment
How well would you be able to complete an aptitude test in a noisy room? Or how about a room that’s freezing from too much air conditioning or too hot due to broken air conditioning? You need to ensure that the test environment is conducive to candidate performance each and every time. This applies to personality assessment too. Although there is no right or wrong, your candidate will certainly feel more able to make an effort and respond accurately if you provide them with the right environment! So, some time before the session you’ll need to check the room and make sure the temperature controls work. On the day, switch them on in good time before the test so that by the time candidates arrive the room is just right. Place a sign on the door to ensure you are not disturbed during the testing session and be sure to silence all phones in the room. Candidates should of course have phones switched off too. Ensure that once the session is over, all candidates leave at the same time so that they do not disturb others. If a candidate really must make a restroom visit, they should be accompanied by an administrator and only one candidate at a time should go. Ensure that upon leaving and rejoining the room the candidate does not disturb others.
(Note: it is also a good idea to check that there is no planned construction nearby and that no fire drills are scheduled on the day of testing. Do this before sending out your invitation to the candidate!)
Summary
By referring to these guidelines you’ll help to ensure that psychometric tests used by your organisation remain as reliable as the publisher intends them to be. By using short-cuts and not following the guidelines you’ll threaten the reliability and therefore the validity of the tests. If you threaten a test’s validity it becomes unfit for purpose which means your company is wasting its money buying psychometric tools!
Interested in learning more about psychometric testing for HRM? Keep reading – your next free session is not far away! To ensure you don’t miss a single instalment, we suggest you follow-us on twitter as each new post will be announced there. You may also like to join our face-to-face psychometric training courses in Singapore or Hong Kong – these range from simple introductory courses through to Certification Courses such as the BPS Level A and BPS Level B Certificates of Competence in Occupational Testing. Not in Singapore or Hong Kong? No problem – we also offer both recorded and live online training in psychometrics! For full details please see here or email us.
DO NOT COPY OR SAVE THIS ARTICLE TO YOUR COMPUTER.
THIS ARTICLE IS CLEARED FOR PUBLISHING ON PSYCHOLOGY1 GROUP SITES ONLY. IT REMAINS COPYRIGHT AND INTELLECTUAL PROPERTY OF PSYASIA INTERNATIONAL PTE. LTD. YOU ARE NOT AUTHORIZED TO PUBLISH IT ON ANY OTHER SITE. YOU ARE NOT PERMITTED TO COPY/PASTE THIS ARTICLE OR TO SAVE IT TO YOUR LOCAL DRIVE. YOU ARE ONLY PERMITTED TO READ IT ONLINE AT OUR WEBSITE. VIOLATION OF THESE TERMS WILL RESULT IN BANNING OF OFFENDING IPS AND LEGAL ACTION FOR THOSE WHO REPUBLISH THIS ARTICLE WHETHER IT BE WITH OR WITHOUT A REFERENCE TO THE ORIGINAL AUTHOR.
Wednesday, July 14th, 2010
In this session we will explore the following:
1. Why psychometric tests are used and how they are useful. We will do this by referring mainly to alternative methods of assessment.
The short answer to the first part of the above question is that psychometric tests are used because (assuming they are well designed tests) they are a reliable and valid means of assessing people. We will discuss in a future session exactly what is meant by reliability and validity when applied to psychometrics.
Let’s consider a few alternatives to psychometric tests and highlight this issue further.
Unstructured Interviews
Most candidates who apply for a job will expect to have an interview at some stage of the process and indeed, most organisations will work an interview into the process. However, how useful is this interview for predicting performance on the job? This depends a lot on the training of those who will be interviewing. Many people who conduct interviews have never been trained. Perhaps one day a boss asked them to go and interview a candidate for a job and it continued from there. They may have years of experience but experience and competence are not the same. Most people who interview use what is known as the traditional interview. It is also sometimes called an unstructured interview. The idea is that this is a time to meet with and get to know the job applicant. Often the interviewer is thinking things such as:
“Let’s see if he has a firm handshake.”
“Let’s see if he looks me in the eye.”
“I’ll ask him what he does in his spare time.”
The problem is that none of the answers to these questions will predict performance at work. So what if I have a limp handshake? Donald Trump (very successful property tycoon) does not even like to shake hands – he’s worried about germs! Imagine him at a job interview. The shake would be very limp if at all. In some cultures it’s rude to look people in the eye, so we cannot assume that those who avoid eye contact will not be good performers or that they are dishonest or hiding something. As for spare time, what about somebody who puts together model cars or aeroplanes at the weekend? Does it mean they will be a good designer or engineer? No, this may simply be a low level weekend interest and not something that would keep them entertained as a career. Not to mention the fact that in some parts of the world it’s actually illegal to ask about people’s hobbies in a job selection process!
The point to grasp then is that often the people conducting interviews have little or no training and are running unstructured interviews that have little relevance to job performance and therefore lack both reliability and validity. However, the suggestion is not that we remove interviews totally!
Structured Interviews
Research has shown that interviews have good reliability and validity when run in a particular way by those who have undergone thorough training. These are called structured interviews. The idea here is to align the interview questions to the competencies required of the candidate to be successful in the job. Then the interviewer asks the same or very similar questions to each candidate based on job requirements. Behavioural interviews are one type of structured interview. The questions are designed to elicit a high level of evidence that the candidate has displayed the behaviour associated with competent performance over repeated occasions in the past. Another type of structured interview is Situational interviewing – here the candidate is asked what they would do in certain situations. Situational interviews are generally less valid than Behavioural interviews. The biggest problem with getting HR and Consultants to run structured interviews is the need for training. PsyAsia used to run a 2-day course in behavioural interviewing, but our clients in Asia told us that would require too much time out of the workplace. We thus reduced this to a one-day course (see our behaviour-based interviewing course here if interested) but whilst this satisfies the big decision makers it really only serves as an introduction to interviewing. There needs to be more communication and understanding between HR and those who hold the purse-strings in Asia if we are to increase competence in this area!
Psychometric Tests and Structured Interviews
So far, we have pointed out that interviews can be reliable and valid, but only if the interviewers have been appropriately trained and are using structured interviews, preferably behavioural interviews. Those using psychometric tools also need to be appropriately trained in order to ensure they remain reliable and valid tools. Assuming training and competence requirements are met for both tests and interviews, why use tests?
Psychometric tests are able to cover a lot more ground in far less time. Aptitude tests give us an indication of numerical, verbal and spatial skills in 18 minutes if using modern tests like the Saville Consulting Aptitude range. There’s no way we could discover this information in even a one-hour interview! Personality assessments can sample and assess personality traits relevant to performance on the job. The average completion time for good personality assessments is 30-40 minutes. There are also a few good faster tools available which take around 20 minutes. The amount of information gleaned in this short period of time is a credit to the developers of psychometric tests. However, with particular regard to personality testing, it is necessary to confirm the profile with behavioural evidence from the candidate. So, whilst the profile may suggest somebody who really enjoys multi-tasking, this becomes a basis for an interview question (assuming this is required by the job).
In essence then, psychometric tests are useful because they provide so much more information than an interview can provide in a much shorter period of time. They have been designed by experts using modern statistical techniques aligned with modern personality research and theory. However, psychometric tests are only part of the story and a well designed interview using competent interviewers will add incremental validity to the assessment process. The interview will serve to confirm (or refute) the psychometric profile and provide rich behavioural evidence (that cannot be recorded by psychometric tests) that the person can perform at the level required by the person specification.
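Incremental validity can be illustrated with a little arithmetic. The sketch below uses the standard formula for the squared multiple correlation with two predictors and asks how much prediction an interview adds on top of a test. All three correlations are hypothetical values chosen for the example, not research findings, and the function names are our own.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("the predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_validity(r_y1, r_y2, r_12):
    """Extra variance explained by predictor 2 over predictor 1 alone."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2


# Hypothetical correlations with job performance (illustration only)
test_validity = 0.40        # psychometric test vs. performance
interview_validity = 0.45   # structured interview vs. performance
overlap = 0.30              # test vs. interview

gain = incremental_validity(test_validity, interview_validity, overlap)
print(f"test alone explains {test_validity ** 2:.2f} of the variance")
print(f"adding the interview explains a further {gain:.2f}")
```

The less the two methods overlap with each other, the more the second one can add, which is why a well designed interview complements a psychometric profile rather than duplicating it.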
Other Methods of Assessment
So far we’ve only looked at different types of interview as an alternative or as complementary to the assessment process. How about other methods of assessment?
Application forms
We all need to complete one of these to show our intention to apply for a job. Realistically though they are there for this reason alone. They serve as a record of information which the organisation deems important to hold on the individual. Current application forms hold no value as selection tools with the exception perhaps of educational and experiential background. This can be changed by designing application forms that elicit only job relevant responses and preparing a scoring system for the form even before sending it out.
CV/Resume
Candidates like to send their CV/Resume because many people have these on file and it’s easy to quickly update it and print it off on a per-job basis. However, again these are not particularly useful in selection. Research shows that decision-makers are often seduced by smart graphics as well as vocab which sells the applicant by over-inflating their achievements. It’s also possible to lie in a CV, although research has shown that most people don’t lie about their educational qualifications or experience as they know the prospective employer can check up on this. What they do tend to lie about or at least mislead about is their level of competence. We suggest that CVs are not used at any stage of the selection process.
Assessment Centres (ACs)
This is where the candidate is invited to a physical location to partake in a number of exercises with other candidates. Most ACs last a day and during that time the candidates will undergo both group and individual exercises such as presentation exercises, negotiation exercises or in-tray exercises. Assessment Centres have been shown to be highly valid and reliable methods of selection when using well trained assessors.
PsyAsia runs training in Assessment Centres and we also offer consultancy in Assessment Centre Design
References
References lack validity in the assessment process and yet organisations continue to request them! Typically a candidate will not give a potential employer the name of somebody who will give them a poor or perhaps even an honest reference. The tendency is to only offer names of those who they trust will give a great reference. On the other hand, if the current employer really wants the candidate to move on they may fake the reference, making the candidate appear almost angelic! Does this mean we should not use references in the selection process? No. It is possible to improve upon the use of references by designing work–related reference forms that elicit behavioural evidence from the previous employer that is in line with the competency requirements of the new job. However, this may lower the response rate as the referee really needs to think about actual behaviours and write them down rather than sending the standard “he’s a great guy” reference.
Graphology
Most organisations aren’t into this, but an alarmingly high percentage of French organisations are! The idea here is that various personality traits can be seen via somebody’s handwriting. Those traits can then be linked to performance at work. So for somebody that writes with very bold strokes, the graphologist may say they are ambitious. This would be good for a salesperson. However, research has shown a lack of reliability in this method. Not only do people write differently depending on their mood, their culture, their upbringing and so on, but graphologists given the same handwriting to analyse often do not agree with each other about the personality traits of the writer! Graphology thus should not be used as a selection tool.
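One common way to quantify how far two raters agree is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an illustration only: the two graphologists, their trait labels and the function name `cohens_kappa` are all invented, and the studies mentioned above may have used a different statistic.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Proportion of items on which the raters gave the same label
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater assigned labels independently
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1 - expected)


# Hypothetical judgements of the same six handwriting samples
graphologist_1 = ["ambitious", "cautious", "ambitious", "cautious", "ambitious", "cautious"]
graphologist_2 = ["ambitious", "ambitious", "cautious", "cautious", "cautious", "ambitious"]

kappa = cohens_kappa(graphologist_1, graphologist_2)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 indicates strong agreement, while a value near 0 (or below) means the raters agree no more often than chance would predict, which is the kind of result that undermines a method's reliability.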
Phrenology
Phrenologists assume that different aspects of personality are stored in different parts of the brain and that where somebody has more of a particular characteristic, the corresponding part of the brain will be larger and hence cause protrusions on the head! The idea would be that you measure different bumps and indentations on your candidates and then project their personality from that. Of course, this method holds no validity and brain imaging tools such as fMRI and PET scans have refuted it.
Astrology
In Asia, people use astrology to help them decide auspicious dates for business openings, functions, weddings and so on. Does it work for job applicants? No! The idea that people born at the same time, in the same place, where the alignment of stars and planets are similar will work in the same way does not hold any weight. Don’t hire employees based on their star signs!
Psychometric Tests and other Selection Methods
As you can see, there are many ways we can assess people. However, each method varies in terms of reliability and validity. Assessment Centres hold very high reliability and validity if done properly, but they are expensive, require lots of resources and skills to run and only assess 6-12 people at a time. We’ve already said that structured interviews are good but again, they take time and resources. Psychometric tools do cost money. However, the cost is offset by the number of candidates that can be assessed and the information that can be gathered in the assessment compared to other selection methods. Don’t forget, an interviewer’s time is costly. A panel interview with 3 interviewers is likely to cost around 2-3 times the fee of a psychometric test and yet will not gather as much information. Not to mention the fact that if you are using the right psychometric tool, its reliability and validity will already have been assessed and will be good. By contrast, we tend to assume that interviews will be reliable and valid if run by trained people, but this is rarely tested!
Psychometric Tests for development, coaching, careers advice and team-building
This lesson has focussed on the use of psychometric tests in candidate selection. However, much of what has been raised applies to the use of tests in other scenarios. For example, in careers advice, psychometric tools allow the counsellor to offer advice which is based on a systematic assessment of the individual’s aptitude and personality alongside the information already on file such as achievements thus far, previous experience, educational qualifications and so forth. In coaching, development and team-building, psychometric tools often serve as a reliable and valid basis for the discussion. Not using these tools means the initiator starts off with far less information and is likely to be less systematic. Psychometrics enables the initiator to work from a validated model and a holistic assessment of the people being developed and not to base interventions and advice on subjective insights.
Wednesday, May 19th, 2010
Join us for a Webinar on June 22
This is a FREE webinar in PsyAsia’s HRM themed webinar series. In this session we are pleased to present research on and answer questions about whether or not Chinese people are significantly different to other major groups and whether any potential differences are likely to impact upon the ability of personality tests to predict performance at work.
Some HR people in Asia believe that culture plays such a significant role in personality that indigenous personality attributes need to be assessed at recruitment/selection. To this end, personality tests have been developed “in Chinese for the Chinese by the Chinese”. A significant question to ask is: Do these tests add any prediction over and above that afforded by mainstream personality tests developed by world renowned experts in the field?
The above questions will be answered through discussion of the trait model of personality and its biological basis. Peer-reviewed and published research conducted by PsyAsia International’s award-winning Psychologist, Dr. Graham Tyler; award-winning Dr. Peter Newcombe of the University of Queensland; and world-renowned Professor Paul Barrett, formerly of the University of Auckland will be presented in an easy to understand format.
As always, the webinar is open to all HR and related professionals in our region. It is not open to competitors. You must provide your corporate email address when registering – we do not approve free email accounts such as yahoo/google/hotmail/rediffmail etc.
All attendees who remain for the entire session will receive a free pdf Certificate of Professional Development. Hard-copy certificates can also be requested for a fee.
Title: Chinese Personality at Work – How Chinese are the Chinese?
Date: Tuesday, June 22, 2010
Time: 5:00 PM – 6:00 PM SGT
After registering you will receive a confirmation email containing information about joining the Webinar.
System Requirements
PC-based attendees
Required: Windows® 7, Vista, XP, 2003 Server or 2000
Macintosh®-based attendees
Required: Mac OS® X 10.4.11 (Tiger®) or newer
Wednesday, March 17th, 2010
The Amazing Apollo Profile
This free webinar will be facilitated by Mr. Jim Bowden, the developer of the Apollo Profile. The session will be interactive (provided attendees kit themselves out with headphones and a mic!) and Jim will present numerous interesting case studies.
The webinar will cover the following:
• Introduction: The Amazing Apollo Profile can transform Recruitment, Staff Development, and Organisation Performance – Client example
• Apollo Questionnaire – valid/reliable/comprehensive
• Why is Apollo amazing? Apollo Advantages
• Using and interpreting Apollo reports, with anecdotes
• Recruitment – Accurate, easy, low cost – Case Study using Apollo Best Match in China for filtering 12,000 applicants for 40 Graduate level jobs
• Training and Development – Unique Apollo report PLUS downloadable solutions. Convenient, low cost, motivating
• Organisation Development. Benchmarking: Can analyse and identify current corporate strengths and weaknesses – then create high performing models/culture, identify engagement issues – case studies
• Customising: Develop models that work specifically for your organisation. If your organisation is serious about leadership through people.
• Integrate everything together with flexible multi-purpose Internet Online solutions. Use your own competencies frameworks and vocabulary – examples
• Special Offer – have to listen to Webinar to find out!
Date: Monday, May 17, 2010
Time: 12:30 PM – 1:30 PM SGT
After registering you will receive a confirmation email containing information about joining the Webinar.
System Requirements
PC-based attendees
Required: Windows® 7, Vista, XP, 2003 Server or 2000
Macintosh®-based attendees
Required: Mac OS® X 10.4.11 (Tiger®) or newer
Space is limited.
Reserve your Webinar seat now at:
https://www1.gotomeeting.com/register/522465752
Wednesday, February 10th, 2010
Identity Questionnaire Research Results – A synopsis
No reproduction without permission.
Introduction to the Study and Outline of the Phases
In September 2008, Quest Partnership Ltd, PsyAsia International, and the Hong Kong Institute of Vocational Education (HKIVE) embarked on a project to translate the Identity Self-Perception Questionnaire from English into traditional Chinese. The reason for translating the questionnaire was to produce an occupationally focused personality questionnaire that could be used in China and Hong Kong SAR. At the same time, Quest were also producing a new Careers Report for the Identity system. This enabled the volunteer students to gain useful feedback on their questionnaire. The project was headed up by Max Choi of Quest Partnership Ltd and Dr. Graham Tyler of PsyAsia International. Max Choi is an Occupational Psychologist with BPS chartered status and has substantial experience in designing and validating tests. Graham Tyler is a registered psychologist and has a PhD based on psychometric assessment and validating tools for predicting performance at work in Asia.
The research was split into several stages:
• Translations – involving the translation and back-translation of Identity into Simplified and Traditional Chinese by professional staff at HKIVE.
• Pilot Study – using the translated Identity questionnaire.
• Phase 1 Testing – a sample of participants at HKIVE completed the Chinese Identity questionnaire.
• Phase 2 Re-testing – participants were asked to complete the questionnaire for a second time one month later i.e. re-testing to determine the reliability of the questionnaire items.
• Data Cleansing – first to identify and remove ‘rogue’ answer sheets from students who did not complete the questionnaire seriously.
• Data Analysis & Results – analysis of the data and understanding the results.
• Producing Norms and Building this into the New Career Focus Report – norms were produced based on these Hong Kong students. This norm group was used for the new Career Focus Report which is now available for the Hong Kong education sector.
• Translation into Simplified Chinese – the project to translate the Identity Questionnaire into simplified Chinese and have it available online was completed in December 2009.
Translations
In September 2008, the questionnaire was translated into both Traditional and Simplified Chinese by 4 individuals at the HKIVE who hold the British Psychological Society’s Level A and B Certificates of Competence in Occupational Testing. This process was supervised by Dr. Graham Tyler, who has a good understanding of principles behind item construction. The translated questionnaire was sent to the test publishers (Quest Partnership Ltd) in the UK for evaluation and further refinement, working with Chinese natives now resident in the UK.
The translated questionnaire was then back-translated into English by lecturers in the English language department at the HKIVE. Independent back-translation provides the quality check of how effective the translation has been. The back-translation was checked against the original version of the questionnaire to ensure it retained its overall theme and meaning. A few items achieved poor back-translations and these were reviewed and improved and back-translated again to check that the translation had improved. The traditional Chinese translation took precedence on the basis that this would be evaluated first and then simplified Chinese would follow at a later date.
Pilot Sample
20 students at VTC completed the translated traditional Chinese questionnaire. They also completed a form which collected their feedback on items that they did not fully understand or where they felt the wording could be improved. This feedback was analysed and a few minor improvements were made for the next phase.
Phase 1 Testing
In October 2008, a large sample of 800+ Chinese students at HKIVE completed the Traditional Chinese Identity Questionnaire. Most of these administrations were conducted under standardised test administration conditions during classes. The final sample after data cleansing consisted of 421 students.
Phase 2 Testing
One month later many of the Chinese students from the Phase 1 testing were invited to complete the questionnaire again. The test-retest study is based on 206 students who completed the questionnaire again. Most of these administrations were conducted under standardised test administration conditions during classes.
Students were entered into a monetary prize draw as an incentive to take part in the research. Also, students received a Career Focus Report from their completed questionnaire.
Data Cleansing
Identifying ‘Rogue’ Responses
We placed stringent requirements on the data that could be used. It was evident that a proportion of the student responses were not usable. This may be as a consequence of asking the students to complete the questionnaire as part of class work. So although they were volunteers, the request during class time may have resulted in some slightly ‘reluctant’ volunteers. Also, others may have become bored after starting the questionnaire and may not have taken the whole questionnaire seriously, unlike real candidates applying for jobs. So a small minority will complete the questionnaire in a non-serious manner. Only a few rogue answer sheets can be visually identified (e.g. students who have put in the same response for the whole column or making neat zig-zag patterns on the answer sheet). So we needed to employ more sophisticated techniques to identify other ‘rogue’ respondents in order to remove these from our sample before conducting further analysis on the data.
Removing Answer Sheets with Too Many ‘3’ Responses
The instructions for completing the questionnaire clearly state that 3 should be used sparingly. But for this Chinese student sample, the mean number of ‘3’s chosen was 30.4, with a Standard Deviation (SD) of 32. For our UK sample, however, the mean number of ‘3’s chosen was 9.85, with an SD of 15. It was decided that participants who gave over 71 unsure ‘3’ responses would be removed from the sample, i.e. those putting down ‘3’ for over a third of their questionnaire items, which is much too high. A caveat, however, is that given the “middle-way” philosophy in the East, it can generally be anticipated that central tendency responding will be higher in China than in the West.
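The cut-off rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function names are hypothetical, and we assume each answer sheet is stored as a list of integer responses in which 3 is the "unsure" option.

```python
from typing import List

# Cut-off quoted in the text: sheets with more than 71 '3' responses are removed.
MAX_UNSURE = 71


def count_unsure(responses: List[int]) -> int:
    """Count how many items were answered with the 'unsure' option (3)."""
    return sum(1 for r in responses if r == 3)


def keep_sheet(responses: List[int], max_unsure: int = MAX_UNSURE) -> bool:
    """Return True if the sheet stays in the sample, False if it is removed."""
    return count_unsure(responses) <= max_unsure
```

A sheet with exactly 71 unsure responses is kept under this rule, while one with 72 is dropped.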
Removing Answer Sheets with Random Responses
We employed two established methods to detect answer sheets which were being completed randomly i.e. the True Response Inconsistency (TRIN) and the Variable Response Inconsistency (VRIN) methods. Both methods are based on paired items which are highly associated in that knowing an individual’s response to one item will provide a very high level of prediction of their response to the other item. Therefore, when a person scores below a certain threshold with many paired items, we can be confident that their responses to the questionnaire have been random.
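To give a feel for how a paired-item check works, here is a minimal sketch in Python. It is not the TRIN or VRIN scoring procedure itself. The item pairs, the tolerance and the cut-off are all hypothetical values chosen for illustration, and we assume both items in a pair are keyed in the same direction.

```python
from typing import List, Sequence, Tuple


def inconsistent_pairs(
    responses: Sequence[int],
    pairs: List[Tuple[int, int]],
    tolerance: int = 1,  # hypothetical: answers within 1 point count as consistent
) -> int:
    """Count item pairs whose two responses differ by more than the tolerance."""
    return sum(1 for a, b in pairs if abs(responses[a] - responses[b]) > tolerance)


def looks_random(
    responses: Sequence[int],
    pairs: List[Tuple[int, int]],
    max_inconsistent: int = 2,  # hypothetical cut-off, not the study's value
    tolerance: int = 1,
) -> bool:
    """Flag a sheet when too many highly associated pairs disagree."""
    return inconsistent_pairs(responses, pairs, tolerance) > max_inconsistent


# Hypothetical pairs of item positions that should attract similar answers.
example_pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]
careful = [1, 1, 5, 5, 2, 2, 4, 4]
careless = [1, 5, 5, 1, 2, 5, 4, 1]
```

In practice the pairs would be chosen from items that correlate very highly in a trusted sample, and the cut-off would be set empirically.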
Data Analysis and Results
Test Re-Test Reliability
At Phase 2, students completed the Identity Questionnaire again about one month after their original Phase 1 completion, so we were able to conduct a test-retest analysis. This allows us to look into the stability or reliability of the questionnaire over time.
The final sample size for the test-retest was 206 after all the data cleansing procedures were conducted. Overall the vast majority of Identity scales were reliable. A small number of scales were below the benchmark of .70. However we need to be reminded that we are dealing with a translated questionnaire so we would expect some loss of reliability compared to the original questionnaire. So the original English Identity questionnaire sets the upper limits.
The original English Identity single scale test-retest coefficients ranged from .77 to .92 (based on a test-retest sample of 121). For the translated traditional Chinese questionnaire the test-retest coefficients ranged from .58 to .87. Seven of the 36 Identity questionnaire scales reported less than ideal test-retest coefficients:
• Consultative .57
• Psychological .61
• Empathy .57
• Adaptability .60
• Theoretical .62
• Rational .59
• Reflective .58
Interestingly, it might be argued that these scales are less meaningful to this student sample and different results are likely to be obtained in a business sample.
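For readers who want to see what sits behind a test-retest coefficient, the sketch below computes a Pearson correlation between two sets of scale scores and compares it with the .70 benchmark mentioned above. We assume a Pearson correlation here, which is the usual choice for test-retest work; the function names are ours and no study data are reproduced.

```python
from math import sqrt
from typing import Sequence

BENCHMARK = 0.70  # the benchmark quoted in the text


def pearson_r(time1: Sequence[float], time2: Sequence[float]) -> float:
    """Pearson correlation between scores at time 1 and time 2."""
    n = len(time1)
    if n != len(time2) or n < 2:
        raise ValueError("need two equally long score lists with at least 2 entries")
    mean1 = sum(time1) / n
    mean2 = sum(time2) / n
    cross = sum((a - mean1) * (b - mean2) for a, b in zip(time1, time2))
    ss1 = sum((a - mean1) ** 2 for a in time1)
    ss2 = sum((b - mean2) ** 2 for b in time2)
    # Undefined (division by zero) if either set of scores has no spread.
    return cross / sqrt(ss1 * ss2)


def meets_benchmark(r: float, benchmark: float = BENCHMARK) -> bool:
    """True when a coefficient reaches the chosen reliability benchmark."""
    return r >= benchmark
```

Each scale gets its own coefficient, calculated from the same students' scores on the two occasions.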
Internal Consistency Reliability
Another method to determine reliability is to look at internal consistency of each scale to see how well items within a scale correspond with one another. From this analysis we identified nine scales at a lower range of reliability coefficients than our ideal of 0.7:
• Social Presence .60
• Direct .61
• Empathy .58
• Adaptability .60
• Decisive .67
• Self Potency .53
• Self Protecting .62
• Social Desirability .63
• Reflective .43
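The synopsis does not say which internal consistency coefficient was used. A common choice is Cronbach's alpha, so the sketch below assumes that statistic purely for illustration; the function name and the data layout (one row of item responses per respondent) are our own.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: List[Sequence[float]]) -> float:
    """Cronbach's alpha for one scale.

    rows: one entry per respondent, each holding that person's responses
    to the k items of the scale (all items keyed in the same direction).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    columns = list(zip(*rows))  # responses regrouped item by item
    sum_item_variances = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)
```

Alpha rises when the items of a scale move together across respondents, which is what "corresponding with one another" means in practice.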
Combining the two methods of establishing reliability it was useful to see if there were any scales that would have both low test-retest and low internal consistency reliability. The following 2 scales had lower reliabilities than ideal:
• Empathy
• Adaptability
We will be collecting more data so with more extensive use of the tool with participants who will be completing the questionnaire for non-research purposes we do expect the reliabilities to improve.
Study Results: Comparisons with UK Data
The results for this group of Hong Kong students were compared against the UK working population and also against a group of UK A Level applicants and Final Year Students for a Design & Technology course at a UK university.
The group of Hong Kong students compared to the other groups tended to be slightly lower on the following scales:
• Independent
• Critical
• Multi-Tasking
• Variety Seeking
• Determined
• Self Potency
• Positive
However, it is not possible to determine exactly why these differences are found, as the groups differ from each other on a range of variables, e.g. motivational aspects (the students were volunteers rather than real job applicants); age differences; cultural and educational experience differences; work experience differences.
Producing Norms & Developing the Career Focus Report
A set of Hong Kong student norms has been established (N= 421) and more data will be added to this at a later date when it becomes available.
At the same time as this research Quest Partnership also developed a new Career Focus Report for Identity and participating students were provided with a report. This new report has been developed with educational clients in mind but can be used by other clients supporting individuals with career guidance. Currently, the report can be normed against the UK working population and the Hong Kong students.
Translation into Simplified Chinese
The project then made traditional Chinese available as an online solution for clients, with a view to collecting ongoing norm data and working with any clients who can support validation studies. In December 2009 the simplified Chinese version was also made available online.
If you are interested in training to use the Identity Questionnaire or if you would like to work with PsyAsia in distributing this assessment, please do get in touch with us.
Tags: Identity personality test, Identity questionnaire, Identity Self-Perception Questionnaire, Identity training, personality questionnaires, personality test distributor, personality test training, personality test training singapore, personality training Posted in BPS Level A & B Certificates, Competence in Psychometric Testing, Online Psychometric Test Systems, Psychometric Test Research, Psychometric Test Training, Reliability of Psychometric Tests | No Comments »
Friday, January 15th, 2010
Types of Bias in Psychometric Test Translation
With the demand and need for psychological tests increasing in various cultures and countries, there is much greater awareness of the issues associated with developing or adapting tests for use in contexts and situations that differ from those for which the test was originally developed. This article focuses on one of the key aspects of translating tests: the types of bias that can occur.
When using a test in a new cultural group, it is not as simple as directly translating the test, administering it and then checking the results for validity. A number of issues need to be considered, such as whether the area assessed by the test applies to the new culture, whether it may be biased towards that group, and whether what is assessed by the test has similar behavioral indicators. These are just some of the potential areas where bias can arise in the translation of tests and affect the validity of the test in the new context.
Van de Vijver & Hambleton (1996) differentiate between three distinct types of bias that may affect the validity of tests that have been adapted for different cultural contexts: construct bias, method bias and item bias.
Construct bias occurs when the construct (e.g. personality) measured by the test displays significant differences between the original culture for which it was developed and the new culture where it is going to be used. These differences can occur in the way that the construct was formulated and developed as well as in the behaviors that are associated with the construct. It is critical to examine whether the underlying theory of the test is subject to construct bias, and this can be done through studies examining the construct and its associated behaviors in the context in which it will be used. If significant differences are found in these studies, it may indicate that there is construct bias. Major revisions may be required to overcome this bias. If they are not made, the validity of the test will be affected.
Method bias refers to factors or issues related to the administration of the test that may affect its validity. Areas where method bias can occur include social desirability, acquiescence response styles, the conditions in which the test was conducted and the motivation of the respondents. Across cultures, there can be differences in these areas, and these can affect the way that respondents answer the items in the test. This may lead to differences being found that are erroneously attributed to cultural differences when, in fact, they are the result of differences in administration procedures. As a result, it is a threat to the validity of tests that have been adapted for use in new cultures. Test developers therefore need to focus not only on the adaptation of the test itself but also on issues regarding the implementation of the test in a new context.
Item bias is another source of bias that can occur in the translation of tests; it refers to biases that occur with the items in the test. This is usually the result of either poor translation choices for items or culturally inappropriate translations. For example, “kick the bucket” is a phrase that refers to passing away in the Western context and is commonly known by most people in that culture; unfortunately, this phrase would have no meaning for people from cultures without any prior experience of it. A literal translation of that phrase would therefore be a poor translation, as it does not convey the correct meaning of the item. The items in the test need to be culturally equivalent: the meaning of the items needs to be correctly translated so as to maintain the validity of the test in the new cultural context.
These are some of the biases that may occur during the translation of tests. Test developers will need to be aware of the sources of bias and take the appropriate measures to avoid these biases.
References:
Van de Vijver, F. and Hambleton, R. K. (1996). Translating tests: some practical guidelines. European Psychologist, 1, 89-99.
Psychometric Training in Singapore, Hong Kong, Malaysia, and China
If you are serious about using psychometric tests properly then we recommend joining PsyAsia International’s Psychometric Assessment at Work Course which leads to a certificate of competence in Occupational Testing Level A and Level B from the British Psychological Society. The Course is run publicly in Singapore and Hong Kong or in-house anywhere.
More details about BPS Level A and B in Singapore and Hong Kong
Online Psychometric Training – Worldwide
Alternatively, you might be interested in introductory Online Psychometric Test Training presented live by a registered psychologist. PsyAsia is offering a special fee of just US$12 for anybody who registers for the February online psychometric training course!
More details about online psychometric test training
Tags: bps certificates of competence singapore, bps level a hong kong, bps level a singapore, bps level b hong kong, bps level b singapore, choosing psychometric tests, level a occupational testing singapore, level b occupational testing singapore, personality assessment, Personality Tests, Reliability of Psychometric Tests, using psychometric test results Posted in BPS Level A & B Certificates, Competence in Psychometric Testing, Error in Psychometric Tests, Personality Tests, Psychometric Test Training, Psychometric Tests, Reliability of Psychometric Tests, Validity of Psychometric Tests | No Comments »
Friday, November 20th, 2009
The Market for Psychometrics in Singapore
There are now so many Psychometric Tests on the market in Singapore that the task of choosing the right one is not easy. Choice is always a good thing. However, as humans we often look for easy or stereotypical ways of making those choices, and they are not always the best ones to make. For example, a client of ours was preparing for an upcoming team-building session. He approached us asking if we had a certain test that he could use in that session. Our answer was that we don’t supply that test for various very good reasons. The client’s response was “but so many people use it”. This is a typical response. Another potential client had been looking around in Singapore for Psychometric Personality Tests to use in his training sessions as an added benefit. He categorically advised us that he was not interested in validity and was looking for something simple and cheap! The reality here is that at best he is wasting his time and the time of those who will complete his tests. At worst and most likely, his trainees will be led to believe things about themselves which frankly may not be true (reliable or valid!).
Science, Psychology, Psychometrics and the Real World of Business
As busy professionals we often assume that if lots of other people are using a test it must be a good one. This is a huge mistake. Our evolution has programmed us to be seduced by glossy advertising materials and confident, friendly salespeople. On the other hand, we have a tendency to be turned off by less glossy scientific figures, statistics and perhaps psychologists such as myself who speak about the science and real value behind a test, its validity! Ultimately then, both our clients and ourselves as psychologists have problems to overcome!!
Psychologists have to be able to explain in more “glossy” terms about the technical properties of a test and our clients, usually the HR and aligned professions, are invited to turn their ears our way for a little while, just long enough to get the notion that there is more to a psychometric test than meets the eye!
Technical Properties of Psychometric Tests
When we talk of the technical properties of a psychometric test, we are referring to things such as its reliability and validity as well as how it was constructed. If a test is constructed well, it will take time. Not months, often years. The test will also evolve over time such that more validity data will be added to its manuals. This process is costly, hence good tests cost money.
If you come across cheap tests, that should start to ring alarm bells. It’s possible to write a few questions on a napkin in a restaurant and call it psychometric and even try to sell it. If it looks good and the questions look relevant perhaps it will sell and gain a huge following. But how reliable is that test?
In other words, can it provide consistent measurement of your candidate? If your bathroom scales provide different results each time you weigh yourself, you take them back and say these are not reliable. Likewise with a test, you need to ensure that it is consistently assessing the constructs that it purports to assess. We often come across new clients who are shocked when we tell them that good personality tests often contain around 200 questions. However, buyer beware of much shorter tests! We know that the longer the test, the more reliable the results (as long as it is not so long that the candidate falls asleep!).
An unreliable test cannot be a valid test, hence reliability is a precursor to validity. However, validity is arguably the most important aspect of a test. You choose to use tests because you want them to illustrate where a candidate stands in terms of their ability or personality or in order to predict how your candidate will perform or behave in a job. The test’s ability to meet this need is referred to as validity.
Some tests on the market are simply more valid than others. In fact, one test in the past year has proven to be more valid than all other tests it was compared with on the market! How come users stay with their current test then? Perhaps because of preference, habit, price, mass-following and so on. However, do ask yourself and your test supplier how valid your test is. This is the single most important technical property in a psychometric test!
Sometimes tests which are more valid will be more expensive but this makes sense. If a test took a long time to develop, was developed well and by a reputable publisher and is based on well founded theories that have been researched internationally, then surely it is worth paying the extra as such a test will provide an excellent return on investment with its strong validity.
Training to use Psychometric Tests in Singapore
Properly developed psychometric tests require proper training to be used competently. If your test supplier requires that you undergo very limited or no training, this is a reflection of the test as well as their lack of understanding of psychometrics. You need to understand the concepts referred to above, as well as error in testing and how to make decisions based on test results, let alone how to feed back results properly to candidates and decision-makers. The type of questions (i.e., forced choice versus rating scales) will also dictate how you can use the results – you need to be trained to understand this! In some parts of the world (South Africa for example), only psychologists can use psychometric tests. Whilst this is a strict rule, it has its logical basis in how easy it is for untrained professionals to use tests wrongly.
Purchasing Psychometric Tests in Singapore
You may also wish to consider where you purchase your tests from, particularly in Singapore. In recent years we have seen an influx of profiteers in the industry who seek to make money but lack any depth of understanding in psychometrics or psychology at work. This will change in time as psychology in Singapore develops. For now however, be wary of this and we suggest that you only purchase psychometric tests from fully registered organisational psychologists who have a firm grounding in personality, psychometrics and psychology at work and who are answerable to professional competence and ethics boards. Many of those selling psychometric tests in Singapore are simply not answerable to anybody in terms of their conduct or competence. You can therefore not be certain that any advice they provide is relevant, up-to-date or will work in your organisation.
There are many more things to be aware of when choosing psychometric tests in Singapore. We cannot entertain them all here due to space constraints. You may wish to look out for training courses in Psychometric Assessment such as our Psychometric Assessment at Work training which leads to the internationally recognised British Psychological Society Level A and B Certificates of Competence in Occupational Testing. Such courses will prepare you further for choosing the right test and thereby avoid costly selection and development mistakes. Look for courses run by experts in psychometrics who are based in Singapore and hence have a strong understanding of test use aligned with local culture, laws and practice.
Note: some Singapore firms will ship in overseas trainers to run psychometric training. We suggest you avoid this training reseller model given that the facilitator is based overseas and is thus likely to lack knowledge of the Singapore business/legal and cultural environment for Psychometric Testing.
This article is Copyright PsyAsia International Pte Ltd.
It was originally written for Human Resources Magazine in Singapore
A shorter version of the article appears in the magazine’s November 2009 issue
Tags: aptitude test distributor, bps level a singapore, bps level b singapore, choosing psychometric tests, level a occupational testing singapore, level b occupational testing singapore, personality test distributor, personality test training singapore, psychometric assessment singapore, psychometric test singapore, psychometric test training singapore, singapore psychometric tests Posted in BPS Level A & B Certificates, Competence in Psychometric Testing, Psychometric Test Training, Psychometric Tests, Reliability of Psychometric Tests, Validity of Psychometric Tests | Comments Off
Wednesday, October 28th, 2009
For psychometric assessments to have utility and be effective when assessing people for various purposes, the assessment has to be reliable and valid for the situation.
No personality test is 100% accurate, and measurement errors from a variety of sources can affect the results. The length (i.e. the number of items) of the assessment affects its reliability, and research has demonstrated that measurement errors are smaller in longer assessments than in shorter ones. In addition, a larger number of items better represents the abstract characteristics that are being assessed. For example, when assessing personality, one cannot expect to obtain an accurate picture of an individual through a few questions, therefore more items are needed. It has to be noted that beyond a certain point, increasing the number of items will not provide further increases in reliability, as other factors such as fatigue will set in.
It is for this reason that good personality assessments will have a large number of items and therefore require some time for the candidates to complete the assessment (usually between 200-250 questions, taking around 30-40 minutes). Psychometric assessments that are shorter will tend to be less reliable and valid. With a large number of items, the reliability of the test will be better and in turn the validity of the assessment will be better too. Validity is all about predicting performance. So with high validity human resource professionals get a higher return on their investment.
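The link between test length and reliability is often illustrated with the Spearman-Brown prediction formula. The post does not name this formula, so treat the sketch below as our own illustration, with a hypothetical function name. It assumes the added items are of the same quality as the existing ones and it does not model fatigue.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened (or shortened).

    reliability: reliability of the current test (between 0 and 1)
    length_factor: new length divided by current length, e.g. 2.0 for twice
                   as many comparable items, 0.5 for half as many
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)
```

For example, `spearman_brown(0.6, 2)` predicts a reliability of 0.75 for a test twice as long, while halving the same test (`length_factor=0.5`) predicts a drop below 0.6.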
Tags: choosing psychometric tests, hong kong psychometric tests, Personality Tests, psychometric assessment blog, psychometric personality tests, psychometric research, psychometric tests and error, singapore psychometric tests, standard error measurement Posted in Competence in Psychometric Testing, Personality Tests, Psychometric Tests, Reliability of Psychometric Tests, Validity of Psychometric Tests | Comments Off
Monday, August 24th, 2009
The first thing to remember is that if you are using a purely ipsative personality test then you should not be comparing test results between candidates. Ipsative tests are self-referencing; they are made up of forced-choice items. They are useful in coaching, team-building and career guidance, but should not be used alone in recruitment and selection scenarios.
Some tests on the market, such as the Saville Consulting Wave or the Apollo Profile are joint normative-ipsative tests and these would be fine to be used to compare between candidates. A normative test is one which allows the candidate to respond based on the strength of their agreement or disagreement with a statement. The end results are then compared with a group of similar others who have previously taken the test (the norm group).
Purely normative tests such as the Identity Self-Perception Questionnaire would also be good to use for comparing candidates. Aptitude tests are by their nature normative tests and hence can be used to compare between candidates.
So, let’s assume that we have administered a normative personality assessment to two candidates and we are particularly interested in finding a candidate with a high tendency towards creative thinking. We have decided to use a personality assessment alongside other means of assessment including an abstract reasoning test to assess this. We ask Lee and Jane to complete both of these tests. These are their scores on the test scale of interest (presented in sten scores):
Lee
Creative thinking: 8
Jane
Creative thinking: 6
Now, keeping in mind that we would never use test results on their own to make a decision, let’s look at how most decision-makers would approach the above scenario based on test results alone for simplicity.
It obviously appears that Lee is somewhat better suited to the position than Jane.
However, in psychometric testing just as in any assessment procedure undertaken for Human Resources, there is always a chance of error. In fact, it’s more than chance! We know that error is always present.
When interviewing somebody the error is present, when running an assessment center the error is also present. Likewise, error is also present in the use of psychometric tests. Given a desire to be scientific, reputable test publishers will actually assess their tests for error.
One way of doing this is to ask a group of respondents to complete the test today and to invite them back a month later to complete the same test. Ignoring practice effects (which are controlled for), the expectation is that there should be a strong relationship between how a candidate scored at time one and how they score at time two. The idea is that test results should remain consistent over time. Psychometricians refer to this as test-retest reliability.
We hope for high test-retest reliability and we really should be choosing tests which have proven high levels. If we don’t we will have little confidence in test results and be very limited in terms of how we use them.
The assessment for error that shows us how much confidence we can have in test scores is referred to as the standard error of measurement (SEM). It uses an equation to ascertain how confident we can be that a candidate’s test result is a reflection of their true score as opposed to their true score PLUS error.
The equation is very simple: SEM = Standard Deviation multiplied by the square root of (1 minus the test-retest reliability of the assessment). If you don’t like statistics, sorry – they really are necessary to use tests competently!
If you choose a reputable test, often the publisher will quote the SEM in the test manual. If not, you can use the equation above to calculate it. You would use the standard deviation for your scale of interest taken from the manual alongside the test-retest from the manual (note…if your publisher fails to provide these figures you should probably not be using their tests!!).
The point is that the lower the SEM (or the higher the test-retest reliability), the better. Why?
Going back to Lee and Jane above. If our test has an SEM of 1.5 STENS, this would mean that we are 68% confident that Lee’s true score for the creative thinking scale is between 6.5 and 9.5 (we add and subtract the SEM from the observed score). It would also mean that we are 68% certain that Jane’s true score lies between 4.5 and 7.5 on the same test.
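If you would like to check these figures yourself, the sketch below calculates the SEM from a standard deviation and a test-retest reliability, and then the 68% band around an observed score. The function names are ours. Sten scales have a standard deviation of 2 by construction, so that is the value you would normally plug in for sten scores.

```python
from math import sqrt
from typing import Tuple


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed: float, sem: float) -> Tuple[float, float]:
    """68% band: the observed score minus and plus one SEM."""
    return (observed - sem, observed + sem)


lee_band = score_band(8, 1.5)   # (6.5, 9.5)
jane_band = score_band(6, 1.5)  # (4.5, 7.5)
```

The two bands overlap between 6.5 and 7.5, which is why we cannot yet be sure that Lee’s true score is higher than Jane’s.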
Now we can see that some doubt begins to arise as to whether the difference observed between the two candidates is the result of a real score difference or an error difference (i.e., the true score for both candidates could be 7!). We don’t want to make a mistake and choose the wrong candidate, so let’s now look at how we can compare the differences.
We can take this further and calculate something called the standard error of difference. This tells us how confident we can be that there is a true difference between the scores of the two candidates. Because both candidates completed the same test, we use the following equation: SEdiff = the square root of (2 * SEM squared of the test in question), which is the same as 1.414 * SEM.
Let’s say that our test has an SEM of 1.5 STENS. Using the SEdiff equation, we get a figure of 2.12 for the SEdiff. This represents our “critical figure”. It means that the difference between the candidates’ scores must be at least 2.12 before we can conclude there is a true score difference.
In our example, the difference between the candidates’ scores is only 2. Hence we cannot conclude there is a true score difference. The implication for selection is that we should not (everything else being equal) select one candidate over the other because, although we observe differences, the differences may not be true differences; they may simply be error differences.
Note that if we choose a more reliable test it will reduce the SEM. So for example, if we have an SEM of 1 STEN, our SEdiff for the above example would be 1.41. In this case, since the difference between the candidates’ scores is 2 STENS, we could conclude that there is a true difference. We would be at least 68% certain, although not yet 95% certain. We won’t go into degrees of certainty in this article, but the point is made!
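The comparison can be sketched in Python using the standard formula for the standard error of difference, the square root of the sum of the two squared SEMs; when both scores come from the same scale this reduces to the SEM multiplied by the square root of 2 (about 1.414). The function names are ours, and the check uses one SEdiff as the critical figure, which corresponds to the 68% level discussed in this article.

```python
from math import sqrt


def standard_error_of_difference(sem_a: float, sem_b: float) -> float:
    """SEdiff = sqrt(SEM_a^2 + SEM_b^2); equals SEM * sqrt(2) for the same scale."""
    return sqrt(sem_a ** 2 + sem_b ** 2)


def is_true_difference(score_a: float, score_b: float, sem: float) -> bool:
    """True when two scores on the same scale differ by at least one SEdiff."""
    critical = standard_error_of_difference(sem, sem)
    return abs(score_a - score_b) >= critical


# Lee (8) and Jane (6) on the same creative thinking scale.
with_sem_1_5 = is_true_difference(8, 6, sem=1.5)  # critical figure about 2.12
with_sem_1_0 = is_true_difference(8, 6, sem=1.0)  # critical figure about 1.41
```

With an SEM of 1.5 the 2-sten gap falls short of the critical figure, whereas with an SEM of 1 it exceeds it.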
In summary, do not compare candidates’ test results without a knowledge of the test’s reliability and standard deviation or, in other words, do not ignore the SEM. Every assessment technique has an error variable. Competent users of psychometric tests will be aware of this and ensure they do not make the wrong selection decision or give incorrect development/careers advice on the basis of error rather than true score differences.
This article is (C) 2009 PsyAsia International. Some websites have been given permission to post this article. The article must always contain our copyright, publisher details and a live link to our website. Please do not violate these terms.
Tags: choosing psychometric tests, psychometric tests and error, Reliability of Psychometric Tests, standard error measurement, standard error of difference, using psychometric test results Posted in Error in Psychometric Tests, Psychometric Test Training, Psychometric Tests, Reliability of Psychometric Tests | Comments Off