If you weren’t able to attend the conference, we have just the solution for you! Purchase an Exclusive Digital Pass to view Keynote Presentations, Featured Sessions, and more all from the comfort of your own home or office.
If you joined us in Scottsdale, you may also gain access to the recorded sessions. Access is complimentary for ATP members who attended the conference and $200 for non-members who attended. For those who did not attend, the cost is $250 for ATP members and $300 for non-members. Purchase your Exclusive Digital Pass today!
The Innovations in Testing Exclusive Digital Pass will include recordings of the sessions below:
Presented by: Sandy Speicher, IDEO
What would it look like to move from a measurement-centered approach to a human-centered one? Sandy Speicher, a partner and managing director of the education practice at the renowned global design firm IDEO, will introduce the concept of "design thinking" - a human-centered approach to innovation that draws from the designer's toolkit to integrate the needs of people, the possibilities of technology, and requirements for success. She will inspire us with examples of how design thinking is being used to innovate in business, healthcare, government, and yes, even education.
The ATP Innovations in Testing Conference is well-known for its high-quality content and for attracting leaders in the assessment industry. This year, ATP is launching the ATP Innovation Lab – a new forum designed to bring to light inventors and entrepreneurs whose technology, products, or services could be “game-changers” for the industry.
The Innovation Lab Participants receive one-on-one coaching from industry mentors who can provide business and industry basics, as well as guidance on networking opportunities. Participants also have access to a presentation coach to assist them in developing their stage pitch and honing their presentation skills.
The ATP Innovation Lab will culminate in a judged session on the Innovations in Testing main stage, where our participants will present their innovations, receive feedback from judges and audience members, and vie for awards.
Presented by: George McCloskey, Philadelphia College of Osteopathic Medicine
This presentation will compare and contrast the psychological constructs of intelligence and executive functions and explore how schools of thought about these two constructs have evolved over a period of nearly one hundred years.
Presented by: Rebecca Lipner, American Board of Internal Medicine; and Bradley Brossman, American Board of Internal Medicine
In the spirit of information transparency and quality improvement, a redesign of the traditional score report was undertaken to deliver more meaningful and detailed feedback to physicians taking high-stakes medical certification examinations. A new score report was produced following many months of measurement research and input from the physician community through focus groups, “think-aloud” usability interviews, and surveys with randomly-selected examinees.
Based on the initial focus groups and usability studies, the redesign addressed several goals: a simpler design, graphic displays of information, meaningful content subscores, and detailed information to help examinees better understand their performance gaps. The report follows an inverted-pyramid style, presenting the broadest information first, followed by more detail in each subsequent section (i.e., the pass-fail decision first, followed by the exam score, subscores, and descriptions of questions missed).
The simple graphical displays made it easier to understand where the examinee stood compared with the passing score and with other physicians. The measurement research led to an improved method of reporting subscores that corrects exaggerated estimates of ability that can sometimes occur when content areas contain only a small number of questions. A listing of the blueprint descriptors for each question missed, along with the medical task of that question, provided more detailed information without sacrificing test security.
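The abstract does not say which correction was used, but one standard way to temper inflated subscores from short content areas is Kelley-style shrinkage toward the group mean, where an unreliable observed score is pulled toward the cohort average. The sketch below is a hypothetical illustration of that general technique, not ABIM's actual method; the numbers are invented.

```python
# Hypothetical illustration: Kelley-style shrinkage of a short-subscale score.
# The less reliable the subscale (few items), the harder the observed score
# is pulled toward the group mean: T_hat = r * X + (1 - r) * M.

def shrunken_subscore(observed: float, group_mean: float, reliability: float) -> float:
    """Estimate true subscore by regressing the observed score toward the mean."""
    return reliability * observed + (1.0 - reliability) * group_mean

# A 5-item content area might have reliability around 0.40, so a perfect raw
# subscore of 100 is pulled strongly toward a cohort mean of 70:
print(shrunken_subscore(100, 70, 0.40))  # 82.0
```

A perfectly reliable subscale (r = 1.0) would leave the observed score unchanged, which is why the correction matters most for the smallest content areas.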
This session will describe the changes made and the process used to make them.
Presented by: Dave Winsborough, Hogan Assessment Systems
For decades, the bulk of traditional commercial assessment has involved participants responding to carefully researched items that are aggregated into scales. Arguments about test development, item response formats, statistical properties of items, modes of administration, and other questions may be useful for the academic community, but have changed the applied assessment industry very little over the last 40 years. Criterion measurement has fared about the same, and has often delivered even less in terms of advancing the industry. In many ways, moving test administration from test booklets and scantron sheets to online and adaptive testing simply shifted forms onto screens and sped up the calculation of scale scores.
On the other hand, digitization has created a fundamentally different testing landscape by radically converting manual, offline processes into online, networked, computer supported (and often dependent) processes. Entire organizations are undertaking digitization initiatives, and increasingly our individual lives are taking place in digitized environments as well. These significant shifts bear important implications for the testing industry.
Specifically, these changes have enabled four significant forces that disrupt traditional assessment. First, traditional assessment items may become less and less relevant over time—or perhaps eventually disappear—as useful behavioral signals are increasingly sampled from digitized human behavior. Voice recognition software, video-based interviewing, and other sources of digital summary data (e.g., geolocation, browser use, online response latencies, and email content analysis) are just a few examples. Second, some data scientists are putting aside theory development in favor of simply jumping into the data, searching vast pools of digital content for relationships between variables. Third, these disruptions have also transformed our notions of outcome variables from traditional measures to aggregated digital data sources such as financial transactions and physiological information. Lastly, and most crucially, testing itself is either disappearing or being transformed into gamification and other entertainment-based formats, and customers prefer it.
In short, disruption is already occurring and testing is being commodified. Given the choice between being disruptors or being disrupted, this session will discuss how the industry should respond.
Presented by: Mark Gierl, University of Alberta; and Andre De Champlain, Medical Council of Canada
On-demand testing is commonplace with most large-scale testing programs because it affords greater flexibility in session scheduling as well as in the candidate's selection of a testing location. It does, however, pose challenges for programs, including overexposure of items due to the high frequency at which exams are administered. Robust item banks—usually predicated on an increase in committee-based item writing efforts—are needed to support routine retirement and replenishment of items.
The Medical Council of Canada (MCC) has been exploring an item development process that might streamline costly traditional approaches while yielding a number of items necessary to support more frequent and flexible assessment. Specifically, the use of automated item generation (AIG)—which uses computer technology to generate test items from cognitive models—has been studied for over five years.
Cognitive models are representations of the knowledge and skills required to solve a given problem. When developing a cognitive model for a medical scenario, for example, content experts are asked to deconstruct the clinical reasoning process into clearly stated variables and related elements. These are then entered into a computer program that uses algorithms to generate multiple-choice questions, or MCQs (Gierl & Lai, 2013).
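The mechanics of template-based AIG can be sketched in a few lines: a stem with variable slots is expanded over the expert-specified values from the cognitive model. This is only a minimal illustration of the general approach Gierl and Lai describe; the variable names and clinical content below are invented, and operational AIG systems encode far richer cognitive-model constraints (including distractor generation).

```python
# Minimal sketch of template-based automated item generation (AIG).
# An "item model" is a stem with slots plus the variables (and values)
# identified when experts deconstruct the reasoning process.
import itertools

STEM = ("A {age}-year-old patient presents with {symptom}. "
        "What is the most appropriate next step?")
VARIABLES = {
    "age": ["25", "60"],
    "symptom": ["chest pain", "shortness of breath"],
}

def generate_items(stem, variables):
    """Yield one item stem per combination of variable values."""
    keys = list(variables)
    for values in itertools.product(*(variables[k] for k in keys)):
        yield stem.format(**dict(zip(keys, values)))

items = list(generate_items(STEM, VARIABLES))
print(len(items))  # 4 stems from 2 x 2 variable values
```

Because item counts multiply across variables, even a modest cognitive model can yield hundreds of candidate MCQs, which is the efficiency gain the MCC reports.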
The MCC has been piloting AIG items for over five years with a number of its examinations, including the MCC Qualifying Examination Part I (MCCQE I)—one of the requirements for medical licensure in Canada. The aim of this session will be to provide an overview of the lessons learned in the use and operational rollout of AIG with the MCCQE I.
AIG has proved beneficial from a number of perspectives in that it has: (1) offered a highly efficient process through which hundreds of MCQs can be generated from cognitive maps, (2) yielded items of a quality level that is at least equal—and in many instances superior—to that of traditionally written MCQs, based on difficulty and discrimination inclusion criteria, (3) provided a framework for the systematic creation of plausible distractors, adding value from the perspective of tailoring diagnostic feedback for remedial purposes, and (4) contributed to an enhancement of the test development process.
This session’s presenters are hopeful that sharing their experiences might not only help other testing organizations interested in adopting AIG, but also foster discussion that will benefit all attendees.
Presented by: Ramesh Nava, Prometric; and Dennis Whitney, Institute of Management Accountants
It is commonplace to buy products manufactured in other countries, outsource services to different regions, and attract talent best suited to a job from all over the world. Two key concepts for the globalization of business are standardization and customization.
Despite this shift toward becoming more globally focused, we are still separated by barriers that can only be overcome by tapping into local knowledge, experience, and exposure. This session will focus on how test sponsors can expand their business into emerging markets such as China, India, and the Middle East. Once the decision is made to venture into new markets, we must adapt to local operating regulations and determine: Who are the right people to connect with? How robust are the data security laws? Who is our target audience in these countries, and how do we reach them? Who can we rely on as potential partners in the region?
As test programs expand internationally, they typically become multilingual and cross-cultural. They are then faced with the challenge of ensuring measurement equivalence, which is an absolute requirement if the tests are to support valid international comparisons.
In this session, we will provide attendees with a roadmap for international expansion and discuss the pitfalls of choosing the wrong partner to implement your expansion strategy. At IMA, we have found that a stable partnership is necessary to make inroads into international markets. The impeccable standards we have set are valued by the entire industry, and candidates are differentiated based on merit. Over the years, we have learned how to customize test delivery channels according to the needs of the country and client while keeping test content secure. We have proven our capabilities in developing content and delivering valid and reliable tests in two languages, while increasing exam volumes dramatically.
Presented by: Ashok Sarathy, GMAC; Jarret Dyer, National College Testing Association; Nikki Eatchel, Scantron Corporation; Wayne Camara, ACT; Denny Way, The College Board; Grady Barnhill, National Commission on Certification of Physician Assistants; and April Cantwell, FurstPerson
Negative media coverage, the opt-out movement, and recertification challenges are just some of the areas in which people have begun to question the value of standardized testing. How are various industry associations responding to these challenges? How are they identifying key stakeholders, engaging them, and defending the value of testing? What more can be done to address these issues within, across, and outside of our industry? A panel of leaders from several industry associations will discuss both the challenges to testing and the efforts they are taking to positively lead the conversations concerning the value of assessments.
Presented by: Quinn Sutton, Alpine Testing Solutions; Stephanie Dille, Pearson VUE; and Amy Riker, Educational Testing Service
If you build it, they will come - NOT! Even the best programs with the best assessments are not guaranteed success. Effective communication and promotion are essential elements of success. During this session you will learn industry best practices in marketing communications and social media. You will discover what others have done and what you can do to promote your program and leverage the newest resources from ATP...and build the value of testing in the industry and your market.
Presented by: Catherine Taylor, Measured Progress
ESSA provides states with an option to develop alternatives to summative testing for accountability purposes. Specifically, these alternatives may "involve multiple up-to-date measures of student academic achievement, including measures that assess higher-order thinking skills and understanding, which may include measures of student academic growth and may be partially delivered in the form of portfolios, projects, or extended performance tasks."
Assessment specialists are actively considering ways to achieve this ESSA provision. States are also considering how to incorporate classroom-based performance assessments into their accountability programs. Past efforts to collect work over time for summative assessment programs have been criticized for their unwieldiness, their lack of evidence for reliability and validity, lack of comparability across different sources, and the difficulty in applying common rubrics to collections of student work from different schools and districts.
This session will present a model for developing portfolios or collections of student evidence that addresses past criticisms. The model includes: (1) developing common scoring rubrics that can be applied across collections of work, (2) developing task shells for performance tasks that can be used to generate multiple comparable tasks that are anchored in classroom contexts, and (3) setting criteria for acceptable numbers and types of evidence.
This session will demonstrate: (1) how this model was applied in a state for a high school graduation portfolio, (2) how the collections were scored, and (3) a standard-setting method that was used to set performance standards comparable to those of the summative assessment.
Assessment providers can support states in implementing similar models by: (1) developing model performance tasks or task shells, (2) designing other generalizable classroom-based tools such as test maps for end of unit tests, and (3) using electronic portfolio methods to collect and score students' collections.
Lastly, this session will also present a summary of the evidence for validity and reliability obtained from this alternate program.
Presented by: Alex Tong, ATA; and Naotomi Umezawa, Global Communication and Testing
The Asian testing market has long been perceived by the rest of the world as large and homogeneous, but the reality is surprisingly diverse. Depending on the country in which you have gained your experience, it may be a very technology-driven, sophisticated test market, or it may lag behind and need help. Due to cultural and economic differences, Asian markets should not be thought of collectively as one testing market.
This session aims to convey that message to the testing industry, with individual speakers representing their regions and discussing the challenges and opportunities they face, illustrating how different—or similar—these markets can be.
Presented by: Emily Fedeles, Baker Hostetler Law Firm; and Melinda McLellan, Baker Hostetler Law Firm
Since the Safe Harbor framework was invalidated in October 2015, multinational organizations have been scrambling to find alternative legal mechanisms for moving employee and customer data from Europe to the U.S. On July 12, 2016, the European Commission formally adopted the EU-U.S. Privacy Shield framework to meet this need, and beginning on August 1, 2016, companies will be able to self-certify as members of the Privacy Shield through the Commerce Department’s website. This webinar will explore the changing legal landscape surrounding cross-border data transfers, describe key features of the Privacy Shield framework for companies looking to certify, and provide an overview of the enforcement risks for companies that handle EU data.
In this webinar, Baker Hostetler counsel Melinda McLellan and associate Emily Fedeles will walk through what this change means for the software industry and your company.
Presented by: Robert McHenry, Independent Consultant; and Krystyna Zaluski, Cambridge Cognition
The ready availability of trackers, smart watches, fitness bands, body worn devices, and clothing with sensors woven into it is providing access to biometric and neurophysiological data about individuals on a continuous and longitudinal basis. Some of this data provides direct information about people’s habits and lifestyles. Other data may be interpreted indirectly to measure personality. Some devices such as the smart watch can even be used in conjunction with the smart phone to assess intelligence or to predict the early onset of Alzheimer’s disease. In recent developments, some employers are asking employees to share data from these devices 24/7 and are creating programs for monitoring employees in and out of the workplace in order to assess employees’ psychological state, current productivity, and to predict their behavior at work.
This session will demonstrate the range of wearable devices currently available to consumers and professionals. It will examine the current outputs from these products (EEG, heart rate, skin conductivity, skin temperature, respiration, skin glucose, muscle mass, etc.) and consider their relevance not only to work behaviors but also to well-being and safety. Using a selection of case studies, this session will demonstrate how wearables can be used in test development and how data from wearables could, even more than at present, benefit both the wearer and the professional monitoring the output. This session will also argue that output from wearables could be used in place of questionnaires for assessment in the clinical, educational, and occupational fields.
Presented by: Rob Pedigo, Castle Worldwide; Roy Swift, Workcred; Bob Mahlman, The Ohio State University; and Rachel Schoenig, Cornerstone Strategies, LLC
Workforce skills credentials - from industry-specific credentials to badges, micro-badges, and more - are continuing to grow in importance across the globe. According to the Connecting Credentials initiative, the last 30 years have seen an increase of more than 800% in the number of certificates awarded by higher education and other education and training providers. One method for addressing the proliferation and fragmentation of workforce skills credentials has been to develop a framework for credentials. For example, in the EU, the European Commission has developed the European Qualifications Framework for lifelong learning. In the U.S., the Connecting Credentials initiative has developed a credentials framework designed to help employers, potential employees, and other stakeholders compare different types of credentials. ATP’s Workforce Skills Credentialing Division is monitoring and providing input to these efforts to ensure both assessment quality and security are taken into account when building a credentialing framework. Join a panel of experts as they discuss the current state of workforce skills credentialing, the ongoing development of credentialing frameworks, and ATP’s involvement in shaping these important initiatives.
Presented by: Manny Straehle, Assessment, Education, and Research Experts
Over the past few years, this session’s presenter has conducted a number of evaluations for various credentialing programs to determine whether they meet validity and fairness standards. Credentialing organizations often evaluate their programs to determine their readiness for meeting accreditation standards such as ISO/IEC 17024:2012 or NCCA’s Standards for the Accreditation of Certification Programs. While these accreditation standards are useful, they are often general and vague, leaving credentialing organizations to determine the detailed methodologies and activities needed to meet them.
Consequently, these credentialing organizations frequently focus on meeting these standards while unintentionally ignoring other resources, such as evidence-based materials (e.g., the Handbook of Test Development) and other testing and research standards (the Standards for Educational and Psychological Testing, AAPOR’s Standard Definitions, NCES Statistical Standards). By working to meet the ISO and NCCA accreditation standards while overlooking resources for strengthening validity and fairness claims, credentialing organizations often share many common threats to validity and fairness.
In this session, the presenter will discuss their own experience evaluating these programs and the common threats to validity and fairness across the psychometric/test development lifecycle (including maintenance) and the program and management processes of credentialing programs. Examples of the common threats will include: (1) SME representation, (2) the required number of SMEs for various activities, (3) defining a detailed scope, (4) lack of survey-based job analyses, and (5) lack of policies such as security and appeals policies. Attendees will learn about these common validity and fairness threats in their own programs and take away lessons on how to strengthen the validity and fairness claims of their programs.
Presented by: Beth Kalinowski, PSI Services
Market research provides decision makers with information on the effectiveness of an organization’s current state, while also providing insights into potential issues. Market research can be used for decision making and developing long-term plans as well. The ultimate goal of any market research, however, is to create products and services that satisfy customer demand. Many credentialing organizations have extensive market research without even realizing it, and this data can be used to develop strategic initiatives, improve customer satisfaction, and ensure that value messaging is being heard.
Presented by: Douglas Whatley, BreakAway Games
Games are a hot topic in assessment, and there is significant confusion about exactly what a game 'is'. There are many types of gameplay - action, puzzles, trivia, casual - and each type has its own uses in assessment. Open World games are a type of game where the virtual game world is open to the player to explore as they desire. There may be story elements motivating them to take certain actions, but they are free to enjoy the game in any way they prefer. Within this open sandbox there can be mini-games or other simulation elements. How can this large open world be used for assessment and/or to scaffold the mini-game assessment elements? This discussion will expose listeners to the range of options in these games and then explore the science behind using them for more comprehensive assessments.
Presented by: O'Neal Hampton, Scantron Corporation, and Sue Steinkamp, Scantron Corporation
Research has shown that predictions of performance in secondary and even post-secondary education can be made using academic data starting as early as 3rd grade. Academic indicators can be used to identify low-performing students in need of academic intervention and gifted students in need of enrichment activities. Such programs utilize data gathered within primary and secondary schools to identify and track early warning indicators (e.g., grades, assessment results, attendance, and discipline) and proactively work with students based on those results. Implications of predictive analytics could include earlier and more effective adjustments to educational approaches, as well as specialized programs that leverage student strengths to prepare them to enter the workforce.
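As a concrete, entirely hypothetical illustration of such a program, a minimal rule-based flagging step over the indicators named above (grades, assessment results, attendance, discipline) might look like the sketch below. The thresholds and field names are assumptions for illustration, not drawn from any real district's model.

```python
# Hypothetical early-warning flagging over common academic indicators.
# Thresholds here are illustrative; real programs calibrate them from
# historical data on which students later struggled.

def early_warning_flags(student: dict) -> list:
    """Return the indicators on which this student trips a risk threshold."""
    flags = []
    if student["gpa"] < 2.0:
        flags.append("grades")
    if student["attendance_rate"] < 0.90:
        flags.append("attendance")
    if student["discipline_incidents"] >= 3:
        flags.append("discipline")
    if student["assessment_percentile"] < 25:
        flags.append("assessment")
    return flags

student = {"gpa": 1.8, "attendance_rate": 0.85,
           "discipline_incidents": 1, "assessment_percentile": 40}
print(early_warning_flags(student))  # ['grades', 'attendance']
```

Rule-based flags like these are typically only the first step; the session's broader question is how to move from such flags to genuinely predictive, calibrated models.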
While educators may have access to an abundance of academic data, they often struggle to turn that data into predictive and rich analytical insight.
Questions this session seeks to address include: (1) what approaches and tools can be employed for the most beneficial view of current and likely future performance, (2) what measures provide the best early warning indicators of a student's probability of success well before they enter high school, and (3) how can we help educational institutions harness the power of predictive analytics to proactively determine the likelihood of future academic performance and potential career success?
The purpose of this session is to start the discussion around these topics and help educational organizations to address these questions to ensure student success. This session will cover the following: (1) use and value of predictive analytics, including specific education examples, (2) processes to track and identify early warning indicators, and (3) recommendations for data sets with most significant value.
Presented by: Andrew Wiley, ACS Ventures
The concept of testing irregularities is receiving increased attention in the testing industry. Loosely defined, a testing irregularity is any incident that impedes the ability to deliver and score tests. Scenarios that have been encountered include (1) students being unable to log in to the systems that deliver tests, (2) distributed denial of service (DDoS) attacks that prevent tests from being delivered to test centers, and (3) students and candidates being kicked out of the testing system in the middle of a test. These problems have occurred in a wide variety of settings, including states such as Florida, Tennessee, and Indiana.
But even with a variety of preventive activities, testing irregularities still do—and will—occur. All too often when these testing irregularities are encountered testing centers and publishers are left scrambling and struggling to recover and restart test administration as soon as possible. Given the frequency with which these testing irregularities seem to occur, more proactive planning must be completed.
The purpose of this Ignite session is to quickly review a set of policies and procedures that could be introduced to help plan for a testing irregularity. After the session, participants will be invited to join a roundtable discussion of the plans presented, as well as other means of proactive planning. This discussion will be a critical part of the conversation and will allow participants to brainstorm and develop new or more efficient models. The session facilitator will take notes and share the outcomes of the roundtable with any attendee interested in seeing them.
Purchase your Exclusive Digital Pass today!