6 ways a data warehouse can supercharge your medical school

Big data, machine learning, data warehousing, predictive analytics: you have likely heard these terms over the last few years as a medical education practitioner. It’s easy to get caught up in the hype around these buzzwords, but how do you separate style from substance when it comes to how data warehousing can help medical education? In plain terms, data warehousing means centralizing data from many places into one place for better reporting. Advances in technology have significantly reduced storage and processing costs, making data warehousing an appealing solution for anyone with large amounts of scattered data and no centralized place to analyze it.

It’s not hard to imagine how medical schools could benefit from data warehousing. Valuable data is constantly being generated on students, residents, curricula, rotations, and faculty. But how good a handle do you have on what this data is telling you? If you’re like the average medical school, this data is stored in many places, such as your admissions, learning management, scheduling, student information, evaluation, and examination systems. On top of this comes key data from external entities such as the NBME (e.g., Step scores) and the AAMC (e.g., GQ responses). Warehouses like Acuity Analytics enable schools to make better use of all of this data to drive positive changes at every level of medical education.

Here are six ways that centralized data provides schools with the insight they need to train exceptional doctors and become tomorrow’s educational leaders:

1. Unlock latent value in your data

Across industries in the United States, less than 50 percent of data, on average, winds up informing decisions. Medical schools are no exception. Unsettling as it is, the effort spent collecting performance data in third-party proprietary systems and spreadsheets is unlikely to yield a positive return on investment unless that data is centralized, which makes it far easier to report on.

Decentralized data living in separate systems is subject to uneven reporting capabilities and data silos. This leaves critical insights buried deep within systems that you may not have access to or cannot easily report on. The cost is often overlooked: combining data across systems is so time-intensive or expensive that inter-system analysis is rarely done, so its benefits go unseen. Warehousing makes the whole of your data greater than the sum of its parts by enabling analysis of important scenarios like:

  • Am I testing what I’m teaching? Where am I over- or under-testing? (examination system data paired with curriculum mapping or LMS data)
  • Are a course’s teaching methods effective at achieving the desired cohort test results? (curriculum mapping or LMS data paired with examination system data)
  • How does this student’s MCAT score relate to their performance in high-stakes exams this year? (admissions system data paired with examination system data)
  • How do different groups of students perform based on admissions or demographic attributes? (examination system data paired with admissions system data)
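
To make the idea of pairing data across systems concrete, here is a minimal sketch of the MCAT scenario using Python’s built-in sqlite3 module as a stand-in for a warehouse. The table names, column names, and rows are all hypothetical and made up for illustration; a real warehouse would load them from your admissions and examination systems, and nothing here reflects how any particular product works.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database with rows from two hypothetical source systems."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE admissions (student_id TEXT PRIMARY KEY, mcat_score INTEGER);
        CREATE TABLE exam_results (student_id TEXT, exam_name TEXT, score REAL);
        """
    )
    # Made-up rows for illustration only (not real student data).
    conn.executemany(
        "INSERT INTO admissions VALUES (?, ?)",
        [("s001", 512), ("s002", 505), ("s003", 518)],
    )
    conn.executemany(
        "INSERT INTO exam_results VALUES (?, ?, ?)",
        [
            ("s001", "Cardiology Final", 81.0),
            ("s001", "Renal Final", 77.0),
            ("s002", "Cardiology Final", 68.0),
            ("s003", "Cardiology Final", 90.0),
            ("s003", "Renal Final", 94.0),
        ],
    )
    conn.commit()
    return conn


def mcat_vs_exam_average(conn):
    """Pair each student's MCAT score with their average exam score."""
    return conn.execute(
        """
        SELECT a.student_id, a.mcat_score, AVG(e.score) AS avg_exam_score
        FROM admissions AS a
        JOIN exam_results AS e ON e.student_id = a.student_id
        GROUP BY a.student_id, a.mcat_score
        ORDER BY a.student_id
        """
    ).fetchall()


conn = build_demo_warehouse()
for student_id, mcat, avg_score in mcat_vs_exam_average(conn):
    print(student_id, mcat, round(avg_score, 1))
```

The join only works because both tables share one student_id. Once the data sits in one place with a shared key, a question like this becomes a single query instead of a manual export-and-match exercise.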

2. Answer questions in a timely manner

Having data live in multiple, siloed systems makes answering questions about your medical school tedious and slow. When questions inevitably pop up in meetings, it usually takes days or weeks to get the answer you needed at the time. The point may be moot once you get the answer, or worse, you may not even bother asking because the time and effort to get the answer aren’t worth it. With a warehouse, you can answer many questions instantly and build reusable reports for more complex questions that don’t have readily available answers. This may make the difference between catching important issues before they snowball and having to address them after it’s too late. The ease of answering questions also means more people can interact directly with the data, and more complex questions can be answered more regularly.

Imagine being able to get a comprehensive overview of how your courses and clerkships are performing each week, identifying on the fly, in the middle of a meeting, which students are struggling, or having key performance indicators (KPIs) that you or your team can monitor at a glance on a daily or weekly basis.

3. Get notified proactively of issues not on your radar

Shifting from reactive to proactive decision making becomes far more feasible with a data warehouse. The open nature of a warehouse, together with its standardized and cleansed data, lends itself to discovering hidden correlations behind performance issues. In fact, with the right skills and resources, you can use machine learning to find predictive indicators of performance at your institution. Many out-of-the-box tools let you mine this data for patterns you might not otherwise have been aware of. Consider the case where predictive analytics on your warehouse data proactively identifies subgroups of students who struggle in key areas, as well as which remedies have been most successful in helping students like them in the past.

4. Free up your staff to do more valuable work

The process of collecting, collating, cleansing, analyzing, and presenting data is time consuming, tedious, and prone to human error. Across industries, it is reported that on average 80 percent of data analysis time is spent simply finding and preparing the data. Many of our customers report that, before switching to Acuity Analytics, these tasks required the equivalent of 1.5 full-time employees. This “data wrangling” is oftentimes not the primary responsibility of the key employees performing it. These demands lead to burnout and low job satisfaction among your staff, especially during key reporting times (end of academic year, LCME site visits). A warehouse can significantly reduce this overhead by doing the heavy lifting of centralizing the data, and in most cases it does so more quickly and consistently than human effort can. In turn, your staff are freed up to focus on more important tasks like interpreting the data rather than trying to get it under control.

5. Help democratize data access

Data’s value is often diminished by the fact that those who could benefit from it never see it. Data silos persist at many schools, which makes it difficult to share data across offices in a timely fashion. It also means one silo may hold a granularity of data (like historical changes to a student’s grade) that another office does not have. Data warehouses provide the centralization needed to give a consistent, timely view to those who need the data. While security access controls still need to be implemented on top of this data, sharing uniform data across your institution becomes a much easier task. Keeping data accessible to those who are in a prime position to effect change now becomes a reality. For instance, schools could automatically surface more granular data to students looking to understand where their strengths and opportunities lie, empowering them to course-correct in areas where they may not have realized they were under-performing.

6. Improve data interpretation and accuracy

The same data silos that prevent people from sharing data also create headaches when trying to combine data from across sources into a single data set, due to inconsistently applied naming conventions and errors propagated in different systems. Some common examples we see include:

  • A field representing a student’s first name may be “firstname” in one system and “f_name” in another
  • The unique identifier for a student may be “123456” in one system, “doej20” in another, and “abcdef456789” in a third system
  • A student grade in one system might be scaled, whereas in another system it might not be

These are just a few examples of how hard it is to make sense of data from multiple sources manually. A data warehouse and its data pipeline do away with this uncertainty and can dramatically improve your data interpretation and accuracy by consistently applying the same conventions and logic to data coming in from vastly different systems. Returning to the examples above, using a warehouse means having only one field name for a student’s first name, only one identifier scheme for a student, and a properly labelled or curated set of grades that makes it clear which are scaled and which aren’t.
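
As a rough illustration of what “consistently applying the same conventions” can look like inside a data pipeline, the sketch below normalizes the three examples above in plain Python. Everything in it is an assumption made for illustration: the system names ("sis", "lms", "exams"), the canonical ID "STU-0001", the alias and crosswalk tables, and the function name normalize_record are hypothetical, and a real pipeline would maintain these mappings per source system.

```python
# Hypothetical mappings; real ones would be built from your own source systems.
FIELD_ALIASES = {"firstname": "first_name", "f_name": "first_name"}

# Crosswalk from (system, local identifier) to one canonical student ID.
ID_CROSSWALK = {
    ("sis", "123456"): "STU-0001",
    ("lms", "doej20"): "STU-0001",
    ("exams", "abcdef456789"): "STU-0001",
}

# Whether each system reports scaled or raw grades (assumed for illustration).
GRADE_SCALES = {"exams": "scaled", "lms": "raw"}


def normalize_record(system, record, id_field):
    """Return a copy of `record` with canonical field names and student ID."""
    normalized = {}
    for key, value in record.items():
        if key == id_field:
            continue  # replaced by the canonical student_id below
        normalized[FIELD_ALIASES.get(key, key)] = value
    normalized["student_id"] = ID_CROSSWALK[(system, record[id_field])]
    normalized["source_system"] = system
    if "grade" in record:
        # Label the grade so downstream reports know how to interpret it.
        normalized["grade_scale"] = GRADE_SCALES.get(system, "unknown")
    return normalized


sis_row = {"id": "123456", "firstname": "Jane"}
lms_row = {"username": "doej20", "f_name": "Jane", "grade": 74}

print(normalize_record("sis", sis_row, "id"))
print(normalize_record("lms", lms_row, "username"))
```

Both rows come out with the same first_name field and the same student_id, and the grade carries an explicit label. The mapping is written down once and applied the same way on every load, rather than being re-decided by whoever is assembling a spreadsheet that week.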

Pulling all of this data together into one place also surfaces issues that might need fixing in the source systems. Such issues often go unnoticed because the data is kept separate from similar data in other systems. They most often show up when details about a student, course, or faculty member differ from one system to another. By combining this data into a single warehouse, a “source of truth” system can be identified and used to help correct data quality issues in other systems.
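
One simple way to surface such discrepancies, sketched here under stated assumptions, is to group records by a shared identifier and flag any field whose value differs between systems. The record layout, the field names, and the find_conflicts helper are hypothetical, and the rows are made up. The sketch also assumes each record already carries a common student_id and a source_system label.

```python
from collections import defaultdict


def find_conflicts(records, key="student_id"):
    """Report fields whose values differ across source systems for the same entity."""
    # entity -> field -> {source_system: value}
    seen = defaultdict(lambda: defaultdict(dict))
    for rec in records:
        for field, value in rec.items():
            if field in (key, "source_system"):
                continue
            seen[rec[key]][field][rec["source_system"]] = value

    conflicts = {}
    for entity, fields in seen.items():
        for field, per_system in fields.items():
            if len(set(per_system.values())) > 1:
                conflicts[(entity, field)] = dict(per_system)
    return conflicts


# Made-up records for illustration only.
records = [
    {"student_id": "STU-0001", "source_system": "sis", "first_name": "Jane", "class_year": 2026},
    {"student_id": "STU-0001", "source_system": "lms", "first_name": "Jane", "class_year": 2027},
]

print(find_conflicts(records))
```

Here the two systems agree on the first name but disagree on the class year, so only class_year is reported. Deciding which system is the “source of truth” for each field remains a human decision; the check only shows where that decision is needed.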


Data warehousing holds the promise of unlocking significantly more value in your medical school’s education data by turning it into timely, accurate, and detailed information that you can act on. This value is critical not only to day-to-day operations, but also to measuring progress against the strategic initiatives that help your school deliver on its mission. By making your data work harder for you, you’ll gain the insight necessary to optimize and individualize the training outcomes of your doctors-in-training. Data warehousing is, ultimately, an important tool in enabling precision medical education at your institution.

Interested in having a data warehouse, but don’t know whether to build or buy? Check out these six factors to consider.

To learn more about Acuity Analytics’ turn-key data warehousing capabilities, request a demo.