Psychometric Assessment Blog
 

Posts Tagged ‘Reliability of Psychometric Tests’

Online Psychometric Test Mini-Course: Lesson 4

Wednesday, July 28th, 2010

In this session we will explore the following:

1. The relationship between reliability and validity in psychometric assessment
2. How psychometric test administrators can impact the reliability of tests

Psychometric Test Reliability

When choosing a reputable test, whether it be aptitude or personality, one of the properties of the test you will need to look for is reliability. We’ll consider reliability in appropriate detail in a later section of the course.  For now, think of reliability as consistency.  In order to have absolute confidence in our test scores we need them to be consistent.  However, we can’t test and retest our candidates in the real world. Despite this, reputable test publishers would already have done this for you. This would have been carried out under optimal conditions.  So, now you know that you are using a reliable test (one that produces consistent scores), it’s your task as the test administrator to ensure that the test remains a reliable test.

Why is reliability so important?

Whenever you assess something, you expect the score you get to be reliable. For example, if you assess your weight using bathroom scales, you expect the reading you get to be consistent across at least the short term. If you weigh yourself over 2 consecutive days and get significantly different readings you know something is wrong with the scales!  The same is true of psychometric tests. The publisher first ensures that the test scores will be consistent over time and then you, as the administrator, need to ensure that your actions do not make the test less reliable.

Not only do we want and expect test results to remain reliable over time, but we also know that reliability is a precursor to validity. It sets an upper limit on the test’s validity. In other words, if your test is not reliable then it is not valid. Confusing?  Let’s use the weighing scales example again…

Let’s suppose a medical doctor does some research which shows that those who weigh more than 120kg are significantly more likely to suffer a heart attack.  His research shows that weight is a valid indicator for predicting a heart attack.  The scales are fit for the purpose of predicting a heart attack.  Validity is all about being fit for purpose.  Now if those scales are not reliable, they will provide inconsistent data over the time of the research program.  In this case would you have confidence in the doctor’s findings? Of course not!

So, to apply this to psychometric tests let’s take an aptitude test. We’ve carried out research which confirms that a new numerical reasoning test can predict the performance of accountants. Those who score better on the test are rated as better accountants.  This is validity. The test is fit for the purpose of predicting accountant performance.  You will hopefully have full confidence in this finding if you know the test is reliable.  If however you suspect the test is coming up with inconsistent scores for your candidates, it is unreliable, and, as in the scales example above, you will not have confidence in the test’s prediction of accountant performance. This is why reliability is a precursor to validity.
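
The idea that reliability sets an upper limit on validity can be made concrete. The following is a minimal sketch, assuming the classical test theory result that an observed validity coefficient cannot exceed the square root of the product of the test's reliability and the criterion's reliability. The function name and the reliability figures are illustrative and are not taken from any test manual.

```python
import math

def max_validity(test_reliability, criterion_reliability=1.0):
    """Upper bound on the observed test-criterion correlation (classical test theory)."""
    for r in (test_reliability, criterion_reliability):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliabilities must be between 0 and 1")
    return math.sqrt(test_reliability * criterion_reliability)

# Hypothetical figures: a reliable test versus an unreliable one
print(round(max_validity(0.81), 2))  # 0.9
print(round(max_validity(0.36), 2))  # 0.6
```

Under these assumptions, a test with a reliability of 0.36 could never show a validity coefficient above 0.6, however good the underlying research is.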

And why is all of this so important for this course?  It’s because you as the test administrator can enhance or reduce the reliability of the test by how you administer it in the first place.  Let’s now take a look at what factors you can and can’t influence in terms of reliability.

How psychometric test administrators can impact the reliability of tests

Figure: Factors Affecting Psychometric Test Reliability (C)2010 PsyAsia International: No Copying

Take a look at the graphic on the left. It shows different factors which can impact the reliability of psychometric tests. This applies to both aptitude tests and personality assessments.

Factors within the test

Generally, a test administrator is not responsible for this. The test publisher must design tests that will be highly reliable. Factors within the test means that the questions chosen must be accessible to all groups for whom the test is intended. If a subgroup finds some questions difficult based on their group membership (e.g. non-native-English speaking groups may not understand a colloquialism used in a test question), then the test will be less reliable for that group. Although the publisher needs to ensure a reliable test, not all test publishers are reputable or know what they are doing! This is why the person who purchases the test needs to know how to evaluate it. We’ll show you later how to evaluate the test in greater detail.  Know for now that you do not evaluate a test or validate it by trialling it on yourself or your colleague as many untrained users think!

Factors within the respondent

Whilst the test administrator cannot control all the possible factors within a respondent, you can do your best to ensure you control for as much as possible.  It’s a good idea to think here about how you would like to be treated if you were undergoing a psychometric assessment for the first time. You’d probably like a friendly invitation letter explaining what is going to happen and why. You’d like to know that your data and results will remain confidential and only shared with decision-makers and only for the purpose that you’re undertaking the test. You’d also like to know what you need to bring with you and if possible, a few example questions as approved by the test publisher might help to set your mind at rest.  Finally it would be good to have a number to call should you have any special needs that you wish to convey to the administrators before the day.  So, when you arrive at the test centre you already know what is going to happen and why, you won’t be overly concerned, you’ll have all the right things with you (e.g., reading glasses) and you’ll know how long the session is going to last. If it’s a personality test you’ll be more likely to be open and honest because you know your results won’t go further than the selection or development committee and won’t be used for reasons beyond the reason you’ve already been given.

Ultimately here you are attempting to control for mood and expectations. Ideally you don’t want these to vary between candidates in order to give everybody the same start line.  On the actual day of the test you will go over all of these things again with the candidates in the room to ensure that they are all clear on what will happen and why.  Again, this sets the scene and mood, demonstrates your organisation’s “humanness” in the assessment process and provides candidates with an opportunity to ask questions.  Furthermore, on the day you will need to ensure that you administer the test instructions word for word and then administer the test exactly as intended by the test publisher. Doing all of this enhances consistency and thus increases reliability.  This is essential as we saw before because reliability is the precursor to validity.

Factors within the environment

How well would you be able to complete an aptitude test in a noisy room?  Or how about a room that’s freezing from too much air conditioning or too hot due to broken air conditioning?  Likewise, you need to ensure that the test environment is conducive to candidate performance each and every time.  This applies to personality assessment too. Although there is no right or wrong, your candidate will certainly feel more able to make an effort and respond accurately if you provide them with the right environment!  So, some time before the session you’ll need to check the room and make sure the temperature controls work. On the day, switch them on in good time before the test so that by the time candidates arrive the room is just right.  Place a sign on the door to ensure you are not disturbed during the testing session and be sure to silence all phones in the room.  Candidates should of course have phones switched off too.  Ensure that once the session is over, all candidates leave at the same time so that they do not disturb others.  If a candidate really must make a restroom visit, they should be accompanied by an administrator and only one candidate at a time should go. Ensure that upon leaving and rejoining the room the candidate does not disturb others.
(Note: it is also a good idea to check there is no planned construction nearby and there are no fire drills scheduled on the day of testing. Do this before sending out your invitation to the candidate!)

Summary

By referring to these guidelines you’ll help to ensure that psychometric tests used by your organisation remain as reliable as the publisher intends them to be. By using short-cuts and not following the guidelines you’ll threaten the reliability and therefore the validity of the tests.  If you threaten a test’s validity it becomes unfit for purpose which means your company is wasting its money buying psychometric tools!

Interested in learning more about psychometric testing for HRM? Keep reading – your next free session is not far away! To ensure you don’t miss a single instalment, we suggest you follow us on Twitter as each new post will be announced there. You may also like to join our face-to-face psychometric training courses in Singapore or Hong Kong – these range from simple introductory courses through to Certification Courses such as the BPS Level A and BPS Level B Certificates of Competence in Occupational Testing. Not in Singapore or Hong Kong? No problem – we also offer both recorded and live online training in psychometrics! For full details please see here or email us.

DO NOT COPY OR SAVE THIS ARTICLE TO YOUR COMPUTER.
THIS ARTICLE IS CLEARED FOR PUBLISHING ON PSYCHOLOGY1 GROUP SITES ONLY. IT REMAINS COPYRIGHT AND INTELLECTUAL PROPERTY OF PSYASIA INTERNATIONAL PTE. LTD. YOU ARE NOT AUTHORIZED TO PUBLISH IT ON ANY OTHER SITE. YOU ARE NOT PERMITTED TO COPY/PASTE THIS ARTICLE OR TO SAVE IT TO YOUR LOCAL DRIVE. YOU ARE ONLY PERMITTED TO READ IT ONLINE AT OUR WEBSITE. VIOLATION OF THESE TERMS WILL RESULT IN BANNING OF OFFENDING IPS AND LEGAL ACTION FOR THOSE WHO REPUBLISH THIS ARTICLE WHETHER IT BE WITH OR WITHOUT A REFERENCE TO THE ORIGINAL AUTHOR.

HRM Webinar: How Chinese are the Chinese? A look at Personality Tests for the China

Wednesday, May 19th, 2010
  Join us for a Webinar on June 22
 
 
   
 
 
This is a FREE webinar in PsyAsia’s HRM themed webinar series. In this session we are pleased to present research on and answer questions about whether or not Chinese people are significantly different to other major groups and whether any potential differences are likely to impact upon the ability of personality tests to predict performance at work.

Some HR people in Asia believe that culture plays such a significant role in personality that indigenous personality attributes need to be assessed at recruitment/selection. To this end, personality tests have been developed “in Chinese for the Chinese by the Chinese”.  A significant question to ask is: Do these tests add any prediction over and above that afforded by mainstream personality tests developed by world renowned experts in the field?

The above questions will be answered through discussion of the trait model of personality and its biological basis. Peer-reviewed and published research conducted by PsyAsia International’s award-winning Psychologist, Dr. Graham Tyler;  award-winning Dr. Peter Newcombe of the University of Queensland; and world-renowned Professor Paul Barrett, formerly of the University of Auckland will be presented in an easy to understand format.

As always, the webinar is open to all HR and related professionals in our region. It is not open to competitors. You must provide your corporate email address when registering – we do not approve free email accounts such as yahoo/google/hotmail/rediffmail etc.

All attendees who remain for the entire session will receive a free pdf Certificate of Professional Development. Hard-copy certificates can also be requested for a fee.

 
Title:   Chinese Personality at Work – How Chinese are the Chinese?
 
Date:   Tuesday, June 22, 2010
 
Time:   5:00 PM – 6:00 PM SGT
 
After registering you will receive a confirmation email containing information about joining the Webinar.
 
System Requirements
PC-based attendees. Required: Windows® 7, Vista, XP, 2003 Server or 2000
Macintosh®-based attendees. Required: Mac OS® X 10.4.11 (Tiger®) or newer
 

Space is limited.
Reserve your Webinar seat now at:
https://www1.gotomeeting.com/register/671216737

Types of Bias in Psychometric Test Translation

Friday, January 15th, 2010

With the demand and need for psychological tests increasing in various cultures and countries, there has been much greater awareness of the issues associated with developing or adapting tests for use in contexts and situations that differ from those for which the test was developed. This article focuses on one of the key aspects of translating tests: the types of bias that can occur.

When utilizing the test in a new cultural group, it is not quite as simple as directly translating the test, administering it and then comparing the results for its validity. There are a number of issues that need to be considered, such as whether the area assessed with the test applies to the new culture, whether it may be biased towards that group, and whether what is assessed by the test has similar behavioral indicators. These are just some of the potential areas where bias can be found in the translation of tests and affect the validity of the test being utilized in the new context.

Van de Vijver & Hambleton (1996) differentiate between three distinct types of bias that may affect the validity of tests that have been adapted for different cultural contexts: construct bias, method bias and item bias.

Construct bias occurs when the construct (e.g. personality) that is measured by the test displays significant differences between the original culture for which it was developed and the new culture where it is going to be utilized. These differences can occur in the way that the construct was formulated and developed as well as in the relevant behaviors that are associated with the construct. It is critical to examine whether the underlying theory of the test is subject to construct bias, and this can be done through studies examining the construct and its associated behaviors in the context in which it will be utilized. If significant differences are found in these studies, it may be indicative of construct bias. Major revisions may be required to overcome this bias. If they are not made, the validity of the test will be affected.

Method bias refers to factors or issues related to the administration of the test that may affect the validity of the test. Examples of areas where method bias can occur include social desirability, acquiescence response styles, the conditions in which the test was conducted and the motivation of the respondents. Across cultures, there can be differences in these areas and these can affect the way that the respondents answer the items in the test. This potentially may lead to differences being found that are erroneously attributed to cultural differences when in fact these differences are the result of differences in the administration procedures. As a result, method bias is a threat to the validity of tests that have been adapted for use in new cultures. Test developers therefore need to focus not only on the adaptation of the test itself but also on issues regarding the implementation of the test in a new context.

Item bias is another source of bias that can occur in the translation of tests and it refers to biases that occur with the items in the test. This is usually the result of either poor translation choices for items or culturally inappropriate translations. For example, “kick the bucket” is a phrase that refers to passing away in the Western context and is commonly known by most people in that culture; unfortunately, this phrase would have no meaning for people from cultures without any prior experience with that phrase. In this manner, a literal translation of that phrase would be a poor translation as it does not convey the correct meaning of the item. The items in the test need to be culturally equivalent, where the meaning of the items needs to be correctly translated so as to maintain the validity of the test in the new cultural context.

These are some of the biases that may occur during the translation of tests. Test developers will need to be aware of the sources of bias and take the appropriate measures to avoid these biases.

References:

Van de Vijver, F. and Hambleton, R. K. (1996). Translating tests: some practical guidelines. European Psychologist, 1, 89-99.

Psychometric Training in Singapore, Hong Kong, Malaysia, and China
If you are serious about using psychometric tests properly then we recommend joining PsyAsia International’s Psychometric Assessment at Work Course which leads to a certificate of competence in Occupational Testing Level A and Level B from the British Psychological Society. The Course is run publicly in Singapore and Hong Kong or in-house anywhere.
More details about BPS Level A and B in Singapore and Hong Kong

Online Psychometric Training – Worldwide
Alternatively, you might be interested in introductory Online Psychometric Test Training presented live by a registered psychologist. PsyAsia is offering a special fee of just US$12 for anybody who registers for the February online psychometric training course!
More details about online psychometric test training

Comparing psychometric test results between candidates

Monday, August 24th, 2009

The first thing to remember is that if you are using a purely ipsative personality test then you should not be comparing test results between candidates.  Ipsative tests are self-referencing – they are comprised of forced-choice items.  They are useful in coaching, team-building and career guidance, but should not be used alone in recruitment and selection scenarios.

Some tests on the market, such as the Saville Consulting Wave or the Apollo Profile, are joint normative-ipsative tests and these can be used to compare candidates.  A normative test is one which allows the candidate to respond based on the strength of their agreement or disagreement with a statement. The end results are then compared with a group of similar others who have previously taken the test (the norm group).
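
To illustrate what comparing a result with a norm group involves, here is a minimal sketch of converting a raw score to a sten score, the scale used later in this article. It assumes the usual definition of stens (mean 5.5, standard deviation 2, range 1 to 10). The norm group mean and standard deviation, the raw scores and the half-up rounding at boundaries are hypothetical choices made for illustration; a real test manual supplies its own norm tables.

```python
import math

def raw_to_sten(raw, norm_mean, norm_sd):
    """Convert a raw score to a sten (mean 5.5, SD 2, limited to 1-10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd          # distance from the norm group mean
    sten = math.floor(5.5 + 2.0 * z + 0.5)   # round half up (an assumption)
    return max(1, min(10, sten))

# Hypothetical norm group: mean raw score 30, standard deviation 6
print(raw_to_sten(38, 30, 6))  # 8
print(raw_to_sten(32, 30, 6))  # 6
```

The same raw score can therefore map to a different sten if a different norm group is chosen, which is why the choice of norm group matters.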

Purely normative tests such as the Identity Self-Perception Questionnaire would also be good to use for comparing candidates.  Aptitude tests are by their nature normative tests and hence can be used to compare between candidates. 

So, let’s assume that we have administered a normative personality assessment to two candidates and we are particularly interested in finding a candidate with a high tendency towards creative thinking.  We have decided to use a personality assessment alongside other means of assessment including an abstract reasoning test to assess this.  We ask  Lee and Jane to complete both of these tests.  These are their scores on the test scale of interest (presented in sten scores):

Lee
Creative thinking: 8

Jane
Creative thinking: 6

Now, keeping in mind that we would never use test results on their own to make a decision, let’s look at how most decision-makers would approach the above scenario based on test results alone for simplicity.

On the face of it, Lee appears somewhat better suited to the position than Jane.

However, in psychometric testing just as in any assessment procedure undertaken for Human Resources, there is always a chance of error.  In fact, it’s more than chance!  We know that error is always present. 

Error is present when interviewing somebody, and it is present when running an assessment center.  Likewise, error is present in the use of psychometric tests.  Given a desire to be scientific, reputable test publishers will actually assess their tests for error.

One way of doing this is to ask a group of respondents to complete the test today and to invite them back a month later to complete the same test.  Ignoring practice effects (which are controlled for), the expectation is that there should be a strong relationship between how a candidate scored at time one and how they score at time two.  The idea is that test results should remain consistent over time.  Psychometricians refer to this as test-retest reliability.
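
As a rough illustration, test-retest reliability is commonly reported as the correlation between the two sets of scores. The sketch below computes a Pearson correlation by hand; the function name and the six pairs of scores are made up for the example and do not come from any published test.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up sten scores for six respondents, tested a month apart
time_one = [3, 5, 6, 7, 8, 4]
time_two = [4, 5, 6, 8, 7, 4]
print(round(pearson_r(time_one, time_two), 2))
```

A value close to 1 indicates that respondents kept roughly the same rank order and spacing across the two sittings. Publishers use far larger samples than this.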

We hope for high test-retest reliability and we really should be choosing tests which have proven high levels.  If we don’t we will have little confidence in test results and be very limited in terms of how we use them.

The assessment for error that shows us how much confidence we can have in test scores is referred to as the standard error of measurement (SEM).  It uses an equation to ascertain how confident we can be that a candidate’s test result is a reflection of their true score as opposed to their true score PLUS error.

The equation is very simple: SEM = Standard Deviation multiplied by the square root of (1 minus the test-retest reliability of the assessment).  If you don’t like statistics, sorry – they really are necessary to use tests competently!

If you choose a reputable test, often the publisher will quote the SEM in the test manual.  If not, you can use the equation above to calculate it.  You would use the standard deviation for your scale of interest taken from the manual alongside the test-retest from the manual (note…if your publisher fails to provide these figures you should probably not be using their tests!!). 
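
The SEM calculation described above can be sketched in a few lines. The standard deviation of 2 is the defined standard deviation of the sten scale; the two reliability figures are hypothetical and are chosen only because they give round answers.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

STEN_SD = 2.0  # sten scales have a standard deviation of 2

# Hypothetical test-retest reliabilities, not taken from any manual
print(standard_error_of_measurement(STEN_SD, 0.75))    # 1.0
print(standard_error_of_measurement(STEN_SD, 0.4375))  # 1.5
```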

The point is that the lower the SEM (or the higher the test-retest reliability), the better.  Why?

Going back to Lee and Jane above.  If our test has an SEM of 1.5 STENS, this would mean that we are 68% confident that Lee’s true score for the creative thinking scale is between 6.5 and 9.5 (we add and subtract the SEM from the observed score).  It would also mean that we are 68% certain that Jane’s true score lies between 4.5 and 7.5 on the same test.
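
The bands for Lee and Jane can be reproduced with a short sketch. The function name is illustrative. The z parameter is an addition for the example: 1.0 gives the roughly 68% band used here, and 1.96 would give a roughly 95% band.

```python
def true_score_band(observed, sem, z=1.0):
    """Return (low, high): the observed score plus or minus z SEMs."""
    return (observed - z * sem, observed + z * sem)

SEM = 1.5  # in stens, as in the example above

lee_band = true_score_band(8, SEM)   # (6.5, 9.5)
jane_band = true_score_band(6, SEM)  # (4.5, 7.5)

# The bands overlap when each one starts below the other's upper end
overlap = lee_band[0] <= jane_band[1] and jane_band[0] <= lee_band[1]
print(lee_band, jane_band, overlap)
```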

Now we can see that some doubt begins to arise as to whether the difference observed between the two candidates is the result of a real score difference or an error difference (i.e., the true score for both candidates could be 7!).  We don’t want to make a mistake and choose the wrong candidate, so let’s now look at how we can compare the differences.

We can take this further and calculate something called the standard error of difference (SEdiff). This tells us how confident we can be that there is a true difference between the scores of the two candidates.  Because both candidates completed the same test, we use the following equation: SEdiff = the square root of (2 * SEM squared of the test in question), which works out to approximately 1.414 * SEM.

Let’s say that our test has an SEM of 1.5 STENS.  Using the SEdiff equation, we get a figure of 2.12 for the SEdiff. This represents our “critical figure”. It means that the difference between the candidates’ scores must be at least 2.12 before we can conclude (with 68% confidence) that there is a true score difference.

In our example, the difference between the candidates’ scores is only 2.  Hence we cannot conclude there is a true score difference.  The implication for selection is that we should not (everything else being equal) select one candidate over the other because, although we observe differences, the differences may not be true differences, they may be simply error differences.

Note that if we choose a more reliable test it will reduce the SEM.  So for example, if we have an SEM of 1 STEN, our SEdiff for the above example would be 1.41.  In this case, since the difference between the candidates’ scores is 2 STENS, we could conclude that there is a true difference.  We would be at least 68% certain, although not yet 95% certain, which would require a difference of roughly twice the SEdiff (about 2.8 STENS).  We won’t go into degrees of certainty in this article, but the point is made!
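
The comparison of Lee and Jane can be sketched as follows. This assumes the standard formula in which the SEdiff for two scores is the square root of the sum of the two squared SEMs, so that for two scores on the same test it is the square root of (2 * SEM squared), or about 1.414 * SEM. The function names and the z parameter (1.0 for roughly 68% confidence, 1.96 for roughly 95%) are illustrative.

```python
import math

def standard_error_of_difference(sem_a, sem_b=None):
    """SEdiff = sqrt(SEM_a^2 + SEM_b^2); for the same test, sem_b equals sem_a."""
    if sem_b is None:
        sem_b = sem_a
    return math.sqrt(sem_a ** 2 + sem_b ** 2)

def is_true_difference(score_a, score_b, sem, z=1.0):
    """True if the observed gap exceeds z * SEdiff (z=1.0 is roughly 68%)."""
    return abs(score_a - score_b) > z * standard_error_of_difference(sem)

lee, jane = 8, 6
for sem in (1.5, 1.0):
    sed = standard_error_of_difference(sem)
    print(f"SEM={sem}: SEdiff={sed:.2f}, "
          f"68%: {is_true_difference(lee, jane, sem)}, "
          f"95%: {is_true_difference(lee, jane, sem, z=1.96)}")
```

With an SEM of 1.5 the 2-sten gap falls short of the SEdiff, while with an SEM of 1 it clears the 68% threshold but not the 95% one.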

In summary, do not compare candidates’ test results without a knowledge of the test’s reliability and standard deviation or in other words, do not ignore the SEM.  Every assessment technique has an error variable.  Competent users of psychometric tests will be aware of this and ensure they do not make the wrong selection decision or give incorrect development/careers advice on the basis of error rather than true score differences.

This article is (C) 2009 PsyAsia International. Some websites have been given permission to post this article.  The article must always contain our copyright, publisher details and a live link to our website. Please do not violate these terms.

 
 
      Copyright © 2001-2010 PsyAsia International Pte. Ltd. A Psychology1 Group Site. All rights reserved.