Why the Smartest People in Radiology Are Paid by the Dumbest Metric
Dr. Avery J. Knapp Jr. argues that wRVU-based radiology pay ignores case complexity, time, risk, and invisible labor, then sketches a fairer AI-weighted model.
Table of Contents
- A Very Brief History of a Very Bad Idea
- Thirty Things wRVU Doesn't Know About Your Worklist
- I. The Patient in Front of You
- II. Study & Report Complexity
- III. Workflow & Environment
- IV. Downstream Work
- V. Risk & Invisible Labor
- The Private Equity Problem
- What a Real System Would Look Like
- Yes, People Will Try to Game It
- The Ask
Picture two MRIs of the lumbar spine sitting in your worklist right now.
The first is a 30-year-old CrossFit guy. New low back pain. No surgical history. No priors. Clean images. You open it, scroll through, dictate "no acute findings, mild disc desiccation at L4-L5, no significant stenosis," sign it, move on. Three minutes. Maybe four.
The second is an 80-year-old woman. Three prior laminectomies. Hardware at L3 through L5. Postoperative changes everywhere. The surgeon wants to know if there's recurrent stenosis above the fusion and whether that fluid collection is a seroma or an abscess. You pull six prior studies. You scroll through each one. You compare. You measure. You hedge your language carefully because this patient is going back to the OR if you say the wrong thing. Twenty-two minutes.
Same CPT code. Same wRVU. Same pay.
If that doesn't bother you, stop reading. This article isn't for you.
If it does bother you, if it's bothered you for years, if you've brought it up at a department meeting and been told "that's just how the system works," keep reading. Because the system is stupid. And I think we can build a better one.
Section 1 A Very Brief History of a Very Bad Idea
In 1985, Congress asked a Harvard economist named William Hsiao to figure out how to pay doctors. Not how to pay them well. Not how to pay them fairly. How to pay them in a way that could be standardized, budgeted, and controlled.
Hsiao and his team developed the Resource-Based Relative Value Scale (RBRVS). The idea was elegant in theory: assign each medical service a relative value based on the physician work involved, the practice expense, and malpractice cost. Multiply by a conversion factor. That's your payment.
CMS adopted RBRVS in 1992. It has been the foundation of physician payment in America ever since.
The values are maintained by the RUC (the AMA's Relative Value Scale Update Committee), a 31-member panel that meets three times a year. They survey physicians. They estimate time and intensity. They vote on relative values. CMS accepts roughly 90% of their recommendations and publishes the Physician Fee Schedule.
Here is the problem, and it has been the problem since day one: the RUC is a committee of mostly surgeons and proceduralists deciding what cognitive work is worth. Radiology has historically had 1 or 2 seats at that table. The people deciding what your MRI read is "worth" have probably never read an MRI.
The values are set by surveys. Self-reported surveys. Of physicians estimating how long things take. If you've ever filled out one of these surveys, you know how absurd they are. If you haven't, imagine someone asking you to estimate, on average, how many minutes you spend on a lumbar MRI, across all clinical scenarios, all patient types, all levels of complexity, and then using your answer to set national payment policy.
That's the system. A survey. Averaged. Run through a committee. Rubber-stamped by CMS. Published once a year. Governing billions of dollars in physician compensation.
"Show me the incentive and I will show you the outcome." Charlie Munger said that. The incentive of the RUC is to distribute a shrinking pie among competing specialties. The outcome is a system that rewards volume over value, penalizes complexity, and hasn't meaningfully changed its methodology since I was in middle school. The system is working as intended. It just isn't working for you.
Section 2 Thirty Things wRVU Doesn't Know About Your Worklist
I've been thinking about this for a while. I started a list of things that affect how hard a radiology study actually is to read that wRVU completely ignores. The list got long. Here are thirty of them.
Section 3 I. The Patient in Front of You
1. Care setting.
This one is so obvious it hurts. A lumbar MRI on a healthy 42-year-old outpatient with low back pain. Same CPT code on a trauma patient in the ER with a possible conus injury and the attending calling you every four minutes for an update. Same code again on an ICU patient with a history of spinal infection and impaired renal function, with the ID team, the neurosurgery team, and the hospitalist all waiting on your read to decide next steps.
Three completely different jobs. Same wRVU. Identical pay.
The ER read is faster-paced, higher-stakes, and requires you to communicate findings in real time. The inpatient read requires chart review, clinical correlation, and often a phone call. The outpatient read is straightforward. Anyone who has spent a single day reading all three knows this. The RUC apparently hasn't spent that day.
2. Patient age.
A chest CT on a 25-year-old athlete (clear lungs, no lymphadenopathy, no incidental findings, done in 120 seconds) and a chest CT on an 82-year-old with COPD, bilateral pleural effusions, a 4.2cm thoracic aortic aneurysm that you need to compare with the prior from 14 months ago, three pulmonary nodules that need Lung-RADS assessment, and an incidental adrenal lesion.
Same CPT. Same wRVU. One took 2 minutes. The other took 12, plus a critical results call, plus a recommendation for follow-up CT in 3 months.
Geriatric and pediatric patients are harder. More pathology. More incidental findings requiring follow-up language. More artifacts (motion in kids, hardware in the elderly). More hedging because the clinical picture is muddier. Every radiologist knows this. The compensation system does not.
3. Clinical indication.
"Rule out fracture" on a wrist X-ray. Fifteen seconds. Normal. Done.
"Evaluate for osseous metastatic disease" on the same wrist X-ray. Now you're scrutinizing every cortical surface. You're looking for subtle lucencies, periosteal reaction, soft tissue masses. You're wondering if that faint area of bone loss is real or just overlapping structures. You're dictating a different report with different recommendations and different liability.
Same study. Same wRVU. The indication changed everything about how you read it. wRVU doesn't ask what the question was. It only knows what body part was scanned.
4. Prior imaging.
Some studies come with eight years of comparison imaging. Twelve prior CTs. Six prior MRIs. The expectation (and the standard of care) is that you review the relevant priors and comment on changes. This is real work. It takes real time. Sometimes the comparison review takes longer than the current study.
Other studies have no priors at all. First-time patient. You read what's in front of you.
Same wRVU.
The system doesn't know and doesn't care whether you spent 10 minutes reviewing priors or zero. It pays the same either way. Which creates an incentive (there's Munger again) to not review priors as carefully. Which is bad for patients. Which is the opposite of what a compensation system should incentivize.
5. Patient body habitus.
Obese patients produce degraded images. More artifact. More limited exams. More equivocal findings. More hedge language. More time.
An ultrasound on a thin patient vs. an ultrasound on a patient with BMI 45. The thin patient's gallbladder is visible in seconds. The obese patient's gallbladder may be partially obscured, requiring additional views, and the report will say "evaluation limited by body habitus."
Same CPT. Same wRVU. One took twice as long and produced a less definitive answer.
6. Cognitive effort and stakes.
A normal brain MRI. Scrolled through. No mass, no acute infarct, no hydrocephalus. Dictated and signed. Easy.
A brain MRI with a subtle area of restricted diffusion in the left posterior inferior cerebellar artery territory that you almost missed because it's three pixels and the patient presented with "dizziness." You catch it. You call the ER. The patient gets tPA. Different outcome.
Same CPT. Same wRVU. One required you to operate at the absolute limit of human pattern recognition. The other didn't.
Should those pay the same? I don't think so. I think the system should know the difference. Right now it can't.
Section 4 II. Study & Report Complexity
7. Number of series or sequences.
A lumbar MRI ordered with 5 sequences (sag T1, sag T1 post-contrast, sag T2, ax T2, sag STIR). Standard. Quick.
The same lumbar MRI ordered by a spine surgeon who wants 12 sequences including post-contrast, axial T1, coronal STIR, and fat-suppressed post-contrast. More than double the images. More than double the scrolling and review time.
Same CPT code. Same wRVU.
The number of images per study has exploded over the last two decades. CT slices have gotten thinner. MRI protocols have gotten longer. wRVU values have not kept pace. This is well-documented and universally acknowledged, and nothing has been done about it.
8. Report complexity and findings density.
A CT abdomen/pelvis noncon on a healthy 35-year-old with right lower quadrant pain. Appendicitis or no appendicitis. You scroll through, find a normal appendix, dictate a clean report. Four lines. Done.
A CT abdomen/pelvis noncon on a 74-year-old oncology patient with 14 findings, among them hepatic steatosis, cholelithiasis, bilateral renal cysts with one Bosniak IIF requiring follow-up, sigmoid diverticulosis with mild wall thickening, bilateral inguinal hernias, degenerative changes throughout the lumbar spine, a 1.3cm adrenal nodule that needs characterization, and three hepatic lesions you need to compare to the prior from six months ago. The report is 25 lines. Each finding needs assessment and a recommendation. The impression is a paragraph.
Same CPT code. Same wRVU. One report took 3 minutes. The other took 15 and generated three follow-up recommendations.
9. Structured reporting requirements.
PI-RADS for prostate MRI. LI-RADS for liver lesions. BI-RADS for breast imaging. TI-RADS for thyroid nodules. Lung-RADS for pulmonary nodules. O-RADS for ovarian/adnexal findings.
Each of these standardized frameworks requires specific measurements, specific language, specific categorization, and specific follow-up recommendations. They take longer than free-text narrative reports. They require you to know the criteria, apply them correctly, and document them precisely.
A TI-RADS evaluation of a thyroid nodule is a different cognitive exercise than typing "thyroid appears normal." But the wRVU is determined by the CPT code, which is determined by what was imaged, not how the report was structured.
10. Multistudy encounters.
CT chest, abdomen, and pelvis. Ordered together constantly. Three CPT codes. Three separate wRVU credits. But you're scrolling through one continuous dataset. You didn't read three studies. You read one and wrote three reports.
The overlap between chest and abdomen means you're covering the same anatomy twice in your scroll. The efficiency gain is real, and it's invisible to wRVU. Some groups split the credit. Others don't. The system has no opinion.
11. Subspecialty training.
A fellowship-trained neuroradiologist reading a complex brain tumor follow-up with perfusion, spectroscopy, and post-contrast sequences, comparing to three prior timepoints, dictating a structured report with tumor board implications.
A generalist reading the same study.
Same wRVU. Same pay. One of them spent an extra year (or two) in training and has 15 years of accumulated pattern recognition in this specific disease. The other is doing their best with a different skill set.
I'm not denigrating generalists. I'm saying expertise has value, and the system doesn't price it. A subspecialist reading within their subspecialty is possibly faster, potentially more accurate, and delivers more value to the patient. wRVU pays them identically to someone for whom this isn't their primary domain.
12. Case complexity scoring (already possible).
Here's what's wild about this whole list: most of these factors are already measurable. Every PACS logs reading time. Every study has metadata (patient age, indication, number of priors, number of series). Every report can be analyzed by NLP for findings count, recommendation count, and complexity.
The data to build a fair system exists right now. In your PACS. In your RIS. In your EHR. Nobody is using it for compensation because nobody who controls compensation has an incentive to use it.
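To make "analyzed by NLP" a little more concrete, here is a deliberately crude sketch in Python. It counts a few auditable markers in a report using regular expressions. The function name, the marker set, and the patterns are all assumptions made for illustration. A real system would use a validated clinical NLP model, not keyword matching.

```python
import re

# Hypothetical marker patterns, chosen only for illustration.
RADS_PATTERN = re.compile(r"\b(?:BI|LI|PI|TI|O)-RADS\b|\bLung-RADS\b", re.IGNORECASE)
RECOMMEND_PATTERN = re.compile(r"\brecommend(?:ed|ation|ations|s)?\b", re.IGNORECASE)
COMPARISON_PATTERN = re.compile(r"\bcompar(?:ed|ison)\b", re.IGNORECASE)


def report_markers(report_text: str) -> dict:
    """Count simple, auditable complexity markers in a finished report."""
    nonblank = [line for line in report_text.splitlines() if line.strip()]
    return {
        "nonblank_lines": len(nonblank),
        "rads_mentions": len(RADS_PATTERN.findall(report_text)),
        "recommendations": len(RECOMMEND_PATTERN.findall(report_text)),
        "comparison_mentions": len(COMPARISON_PATTERN.findall(report_text)),
    }


simple_report = "No acute findings."
dense_report = (
    "Comparison: CT chest from 14 months ago.\n"
    "Three pulmonary nodules, Lung-RADS 3.\n"
    "Recommend follow-up CT in 6 months.\n"
)

print(report_markers(simple_report))
print(report_markers(dense_report))
```

Even this toy version separates a one-line normal from a report with a comparison, a structured category, and a follow-up recommendation. None of it depends on how many words were dictated.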
13. Image quality and protocol adherence.
A well-positioned, properly-windowed study with the right sequences acquired correctly reads in half the time of a study where the tech cut corners, missed coverage, or used the wrong protocol. The radiologist downstream of a strong tech team is doing a different job than the radiologist cleaning up after a weak one.
Degraded image quality means more equivocal findings, more hedge language, more "cannot exclude" verbiage, and more recommendations for repeat imaging. The tech's work determines the floor of what the radiologist can deliver. wRVU doesn't know if the images were perfect or garbage. It pays the same either way.
Section 5 III. Workflow & Environment
14. Time of day.
2pm on a Tuesday vs. 3am on a Saturday. I've done both. They're not the same.
Night reads carry a human cost that doesn't show up on a productivity report. Sleep disruption. Circadian rhythm damage. Family sacrifice. Cognitive impairment (and the research on this is clear: error rates go up at night). Some groups pay a shift differential. Most bake it into a blended rate. wRVU itself assigns identical value to a study read at noon and a study read at 3am.
If you're paying someone to work on Christmas morning at 4am, and you're paying them the same per-wRVU as a Tuesday afternoon, you're telling them their sacrifice doesn't matter. It matters.
15. Efficiency of the system you're reading on.
This one drives me crazy. (I have a pet peeve about slow computers. Slow people I can tolerate, usually. But slow computers, no.)
A PACS that loads studies in half a second with intelligent hanging protocols and two-click prior comparison, vs. a PACS from 2009 that takes 8 seconds per series, crashes twice a shift, and requires you to manually load each prior.
The radiologist on the fast system reads more studies per hour. Not because they're smarter or faster. Because the software is better. They get paid more (in a productivity model) for reading on a better system they didn't build and didn't choose.
The radiologist on the slow system reads fewer studies. Gets paid less. For the same skill, same training, same effort. Because their employer bought cheap software.
wRVU doesn't know what PACS you're sitting in front of. It probably should.
16. Actual time required.
The dirty secret of wRVU: the correlation between wRVU value and actual reading time is loose at best. Some 1.48 wRVU studies take 3 minutes. Some take 20. Some 0.7 wRVU studies are genuinely quick reads. Others are traps that eat 15 minutes and require three comparison studies and a phone call.
The RUC sets values based on surveyed averages. But an average hides the spread. The variance within a single CPT code is enormous. And the variance across clinical scenarios (which the CPT code doesn't capture) is even bigger.
If you're paid per wRVU and you know which study types have the best wRVU-to-minutes ratio, you cherry-pick. Everyone knows this happens. Nobody talks about it openly because everyone is doing it. The system incentivizes it.
My father used to say: "You can work harder than the other guy, or smarter to get ahead. But the most successful people do both." wRVU doesn't reward harder. It doesn't reward smarter. It rewards faster. That is a misalignment of incentives. The incentive should be to deliver the best care possible in the most efficient way possible.
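To put a number on the cherry-picking incentive, take the 1.48 wRVU example from earlier in this item and the two reading times mentioned there (3 minutes and 20 minutes). The short Python sketch below just does the division. The function name is hypothetical.

```python
def wrvu_per_minute(wrvu: float, minutes: float) -> float:
    """Credit earned per minute of reading time for a single study."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return wrvu / minutes


# Illustrative figures from the text: one 1.48 wRVU code, two very different reads.
quick_read = wrvu_per_minute(1.48, 3)    # roughly 0.49 wRVU per minute
slow_read = wrvu_per_minute(1.48, 20)    # 0.074 wRVU per minute

# The quick read earns credit about 6.7 times faster than the slow one.
print(round(quick_read / slow_read, 1))
```

Same code, same credit, and a nearly sevenfold difference in what a minute of your time earns. That gap is the whole cherry-picking incentive.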
17. Transcription method.
At Expert Radiology, we provide paid human transcriptionists for our radiologists (currently along with RadPair + RadAI as an alternative). The transcriptionist produces a clean draft. The radiologist reviews, edits, and signs.
At most other places, the radiologist uses voice recognition (Dragon, PowerScribe, whatever) and edits their own report. They're doing two jobs simultaneously: interpreting the study and transcribing/editing the report.
The radiologist with a transcriptionist is faster, produces cleaner reports, and can focus entirely on the medicine. The radiologist doing their own transcription is slower through no fault of their own.
Same wRVU. But one of them is doing more labor per study. (This is one of the things we offer differently at Expert, and our radiologists love it. It's an efficiency multiplier that doesn't cost the radiologist anything.)
18. Quality of the clinical history provided.
Order: "MRI brain." Indication: blank. Or worse: "pain."
Now you're a detective. You're pulling the patient's chart (if you even have access). You're guessing what the clinical question is. You're reading the study blind to the context that would help you focus your attention.
vs. an order with: "42-year-old female, 3 weeks of progressive right-sided weakness, MRI to evaluate for demyelinating disease vs. mass lesion, prior CT head 2 weeks ago showed subtle low density in left corona radiata."
Same study. One comes with a roadmap. The other comes with nothing. The radiologist given less context may have to do more work to deliver the same quality. wRVU doesn't compensate for that work.
19. Protocolling burden.
At many practices, radiologists protocol their own studies. You review the order, decide on the sequences, decide on contrast, decide on coverage. For complex MRI cases this can take 3 to 5 minutes per study before you ever read the first image.
This is pre-read physician work. It requires training, judgment, and clinical decision-making. It generates exactly zero wRVU.
Some places have technologists or mid-levels protocol. Some don't. The radiologist who protocols 50 studies a day is doing real work that the radiologist at the shop with tech-protocolling isn't doing. Same wRVU per read.
20. Interruptions.
You're 90 seconds into a complex study. Phone rings. ER wants a wet read on a trauma CT. You stop what you're doing, context-switch, read the trauma, call the results, go back to what you were doing, and spend 30 seconds remembering where you were.
Interruption cost is well-studied in cognitive science. It's not just the time of the interruption. It's the ramp-back-up time. It's the increased error risk. It's the cognitive load of holding two tasks in memory.
Radiologists in high-interruption environments (ER coverage, on-call reading) produce fewer wRVUs per hour not because they're slower, but because they're being interrupted constantly. The system penalizes them for it.
21. Teleradiology vs. on-site.
A teleradiologist doesn't have the patient in the building. Can't walk down the hall and look at the patient. Can't ask the tech "did this patient move during the sequence or is that real?" Can't have a quick hallway conversation with the referring provider.
An on-site radiologist can do all of those things. And those things make reads faster and more confident.
Teleradiology has different constraints. Different information gaps. Different communication challenges. Same wRVU.
I run Expert Radiology, so I'm not anti-telerads. I'm saying the constraints are different and the compensation system is blind to it.
22. Worklist composition.
Some worklists are curated. A subspecialist getting fed neuro studies all day. Some are firehoses - whatever comes in, you read. The radiologist who has to context-switch between pediatric MSK and an adult body CT and a stroke MRI every fifteen minutes is doing harder cognitive work than the radiologist reading 40 mammograms in a row.
Cognitive switching cost is measurable and ignored. Every context switch burns time and increases error risk. A curated worklist lets you stay in one mental mode. A mixed worklist forces you to reboot your brain every few minutes. Same wRVU per study regardless.
Section 6 IV. Downstream Work
23. Communication burden.
Critical results laws require you to pick up the phone and call the ordering provider for certain findings. That call takes time. Sometimes you're on hold. Sometimes the provider isn't available and you're paging, waiting, calling back. Sometimes the conversation is 30 seconds. Sometimes it's 10 minutes because the clinician wants you to walk them through the images.
BI-RADS 0 recalls, BI-RADS 4/5 notifications. Incidental findings requiring follow-up per ACR guidelines. Each one generates communication work.
None of it generates wRVU.
24. Turnaround time pressure.
A STAT ER read with a 15-minute TAT expectation is a different animal than a routine outpatient read that can sit in your worklist for hours. The STAT read adds urgency, stress, and the need to prioritize it above whatever you were currently doing (see: interruptions, above).
Same wRVU. Different cortisol.
Some groups have TAT bonuses. Most don't. wRVU doesn't know or care how fast you turned it around. Which is strange, because the clinical value of a fast read is potentially substantially higher than that of a slow one. Ask any ER doc. Maybe if we did a better job of incentivizing both quality and fast, efficient TAT, radiology could better serve the world.
25. Addendum and callback burden.
You sign a report. An hour later, the clinician calls with new clinical information. You re-review the study. You issue an addendum. That addendum generates zero wRVU.
Or you get a peer review case where your original read is questioned. You spend 20 minutes pulling the case, re-reading it, responding. Zero wRVU.
Or the ordering provider entered the wrong clinical history and now you need to re-dictate with the correct context. Zero wRVU.
All real work. All invisible.
26. Multi-modality correlation.
The patient had a CT chest last week and now has an MRI of the thoracic spine. To read the MRI properly, you need to pull the CT and correlate the findings. Is that vertebral body signal change you're seeing on MRI the same thing that was an incidental finding on CT? You have to check.
This cross-referencing is expected. It's standard of care. And it adds time. Sometimes significant time if you're pulling studies from a different facility or a different PACS.
Zero additional wRVU.
27. Second-look and overread burden.
Residents, fellows, and PAs do preliminary reads at many institutions. The attending radiologist reviews the study, reviews the prelim, and signs the final report.
The attending's "reading time" on these studies is shorter because someone else did the first pass. But the responsibility is 100% the attending's. The liability is 100% the attending's. And the cognitive task is different from a cold read. You're doing pattern recognition AND checking someone else's work AND deciding whether to agree, modify, or rewrite.
wRVU doesn't know if you cold-read the study or overread a prelim. Pays the same.
Section 7 V. Risk & Invisible Labor
28. Medicolegal risk.
Mammography carries more malpractice exposure per study than almost anything else in radiology. Emergency CT head in a stroke workup. Trauma CT. Pediatric imaging where a missed finding has decades of consequences.
Compare that to an outpatient knee X-ray in a 40-year-old runner.
Risk is not priced into the unit. A radiologist reading high-risk studies all day is exposed to more liability per wRVU than a radiologist reading low-risk studies. But the compensation is identical.
Insurance premiums know the difference. wRVU doesn't.
29. Follow-up imaging responsibility.
When you recommend "follow-up CT in 6 months for pulmonary nodule," you've just created a chain of medicolegal responsibility. If that patient doesn't get the follow-up, and the nodule turns out to be cancer, the question of who is responsible for tracking the recommendation will come up in a courtroom.
Some practices have automated tracking systems. Most don't. The downstream liability of a recommendation persists long after the wRVU was counted.
Lung-RADS nodule tracking. BI-RADS short-interval follow-ups. Bosniak cyst surveillance. Every one of these creates future risk. wRVU counts the initial read and nothing after.
30. Peer review, tumor boards, and quality work.
Radiologists who participate in peer review, tumor boards, multidisciplinary conferences, and quality improvement generate zero wRVU for that time. These activities make the practice better. They improve patient care. They reduce errors.
But if your compensation is purely wRVU-based, every hour in tumor board is an hour you're not getting paid. Which creates an incentive to skip tumor board. Which is bad for patients. Again.
The system punishes the radiologists who care most about quality. That's not a feature. That's a bug.
Section 8 The Private Equity Problem
So if wRVU is broken (and it is), surely someone has tried to fix it?
They have. Sort of.
vRad (now part of Radiology Partners, which is PE-backed) developed Time-Based Units (TBUs). The idea was to weight studies by actual reading time instead of the RUC's surveyed averages. In theory, fairer. In practice: who measured the time? Who set the weights? And more importantly, who benefits from the reweighting?
When a PE-owned company that employs thousands of radiologists redesigns the compensation metric, the question you should ask isn't "is this more fair?" It's "more fair for whom?"
Radiology Partners. vRad. LucidHealth. USRS. SimonMed. Premier. Rayus. Rezolut. Envision. MedQuest. These are PE-backed or PE-owned groups that collectively process millions of radiology studies per year. A 5% shift in how studies are weighted across that volume is worth tens of millions of dollars annually. It goes to the radiologists, to the investors, or, to be fair, to some combination of the two. Guess which one PE optimizes for.
I'm not saying TBUs or other proprietary weighting systems are worthless. I'm saying the incentive behind them deserves scrutiny. If the system were designed for radiologists, radiologists would have designed it. Instead, it was designed by the companies that pay radiologists. And those companies have shareholders.
Howard Marks once wrote: "Our industry is full of people who were right once in a row." The RUC was a reasonable idea in 1992. PE-built metrics were a reasonable response to the RUC's failures. Being right once doesn't make you the permanent authority on physician compensation. Especially when your second act is designed to protect margins, not doctors.
It's the same problem as the RUC, just wearing different clothes. A small group of people with a financial interest in the outcome gets to define how value is measured. The people actually creating the value (you, the radiologist reading the study at 3am) don't get a seat at the table.
"Show me the incentive and I will show you the outcome." It keeps coming back to Munger.
Section 9 What a Real System Would Look Like
Here's what I think. I think we can build a better system. Not a perfect one. Perfection doesn't exist in compensation design. But a dramatically better one.
I'm calling it AI-Weighted Relative Value - AI-wRVU, or AI: Where Radiologists Vanish Ultimately. The idea is simple:
Take the same total dollars currently being paid to radiologists. Don't add money. Don't remove money. Revenue-neutral. Same pie. Just cut it more honestly.
Then use AI and existing data to weight each study by what it actually required, not what a committee in 1992 guessed it might require on average.
Here's what the system would consider (and here's the key part: all of this data already exists):
Study-level metadata. CPT code, patient age, clinical indication, time of day, care setting (OP/IP/ER), number of prior studies in PACS. All of this is in your RIS right now.
Actual reading time. PACS logs when you open a study and when you sign the report. This data exists. It's granular. It's study-specific. It's not a survey average. It's what actually happened.
Complexity scoring. Natural Language Processing (NLP) on the final report: number of findings, number of recommendations, number of comparison studies referenced, structured reporting framework used, length of impression. Report complexity is a measurable proxy for study complexity.
Subspecialty match. Was this study read by a fellowship-trained subspecialist within their domain? Or by a generalist? Both have value. But the match matters, and it's knowable.
Communication events. Did the read generate a critical results call? A callback? An addendum? These are logged in most systems.
System efficiency factor. Average load time per study on the platform the radiologist is using. Normalize for system speed so you're measuring the radiologist, not the PACS.
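As a rough sketch of the reading-time and system-efficiency inputs, the Python below derives reading minutes from an open timestamp and a sign timestamp, then subtracts time spent waiting on the system. The function name, the parameter names, and the choice to subtract load time (rather than, say, scale by it) are assumptions for illustration. Actual PACS audit logs will have their own field names and quirks.

```python
from datetime import datetime


def adjusted_reading_minutes(
    opened_at: datetime, signed_at: datetime, load_seconds: float = 0.0
) -> float:
    """Minutes between opening a study and signing it, minus system load time."""
    elapsed = (signed_at - opened_at).total_seconds()
    if elapsed < 0:
        raise ValueError("signed_at is earlier than opened_at")
    # Never go below zero, even if the logged load time exceeds the elapsed time.
    return max(elapsed - load_seconds, 0.0) / 60.0


# Hypothetical read: 12.5 minutes on the clock, 6 series at 8 seconds each
# spent waiting on a slow PACS.
opened = datetime(2024, 1, 9, 14, 0, 0)
signed = datetime(2024, 1, 9, 14, 12, 30)
print(adjusted_reading_minutes(opened, signed, load_seconds=6 * 8))
```

The point of the adjustment is the one made above: measure the radiologist, not the software they were handed.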
The formula would look something like this:
AI-wRVU = base_wRVU x (setting x age x indication x priors x time_of_day x series x complexity x subspecialty x communication x transcription)
Each factor calibrated from real data across millions of studies. Not surveys. Not committee votes. Data.
Revenue-neutral means: the sum of all AI-wRVUs across the system equals the sum of all traditional wRVUs. Nobody's pie gets bigger. The slices just get cut more honestly. Radiologists reading harder studies get paid more. Radiologists reading easier studies get paid less. Total cost to the healthcare system: identical.
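Here is a minimal Python sketch of the formula plus the revenue-neutral step. The factor values are invented placeholders, not calibrated weights, and the function names are hypothetical. The only thing it is meant to show is the mechanics: multiply the base wRVU by the factors, then rescale everything so the total matches the traditional total.

```python
from math import prod


def raw_ai_wrvu(base_wrvu: float, factors: dict) -> float:
    """base_wRVU multiplied by every factor for one study."""
    return base_wrvu * prod(factors.values())


def revenue_neutral(studies: list) -> list:
    """Rescale raw AI-wRVUs so their sum equals the sum of the base wRVUs.

    `studies` is a list of (base_wrvu, factors) pairs.
    """
    raw = [raw_ai_wrvu(base, factors) for base, factors in studies]
    total_base = sum(base for base, _ in studies)
    total_raw = sum(raw)
    if total_raw == 0:
        raise ValueError("cannot rescale when all raw values are zero")
    scale = total_base / total_raw
    return [value * scale for value in raw]


# Two lumbar MRIs with the same base value. Factor values are made up.
easy = (1.48, {"priors": 1.0, "complexity": 0.8})
hard = (1.48, {"priors": 1.5, "complexity": 1.6})

weighted = revenue_neutral([easy, hard])
print([round(w, 2) for w in weighted])   # [0.74, 2.22]
print(round(sum(weighted), 2))           # 2.96, same as 1.48 + 1.48
```

The hard case ends up with three times the credit of the easy one, and the total is unchanged.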
Two paths here. The urgent one: any group can build this internally. Tomorrow. You already have the PACS data, the RIS metadata, the reports. An internal AI-wRVU doesn't need CMS approval. Doesn't need the RUC's blessing. It's your group, your data, your definition of fair.
The slower path: eventually, this (or something like it) should replace the Medicare Physician Fee Schedule methodology. That's a longer fight. Validation studies. Pilot programs. Political will that doesn't currently exist.
But the internal version proves the concept. You build it inside your group first. You show it works. You publish the methodology. Then you let the evidence make the argument to CMS. Don't wait for the government to fix this. Fix it yourself and show them what you built.
Section 10 Yes, People Will Try to Game It
This is the first thing everyone says. "People will game it."
Fair. Any compensation system can be gamed. The question is whether the new system is harder to game than the old one.
The current system is trivially easy to game. Cherry-pick high-wRVU-to-minute-ratio studies. Avoid complex cases. Sign faster, review fewer priors, skip the phone call, dictate a shorter report. Every radiologist in a productivity model knows which studies are "good" wRVUs and which ones are traps. Human nature does the rest. The gaming is invisible because the metric is too crude to detect the difference between a thorough read and a fast one.
A multi-factor system is harder to game because you'd have to game multiple variables at once. Let's walk through the obvious attacks.
"I'll just leave cases open longer to inflate my reading time." Reading time is one factor among many. If your report is two lines and the study metadata says normal outpatient chest X-ray, a long reading time doesn't inflate your score. It flags you as an outlier. A 20-minute read on a study everyone else finishes in 3 minutes earns you a conversation with your department chair, not a bonus. Or the system can simply exclude outlier reading times from the calculation.
"I'll write longer reports to inflate my complexity score." NLP is not a word counter. It can tell the difference between a report with 12 real findings and a report that restates "no acute abnormality" in 14 different ways. Findings density, recommendation count, structured reporting elements (BI-RADS, Lung-RADS, PI-RADS categorizations), comparison references. These are specific, auditable markers. Padding a report with filler doesn't generate findings. It generates noise.
"I'll make unnecessary critical results calls." Communication events are logged. If you're calling critical results on studies where no critical finding exists in your report, that's not gaming. That's a quality problem. And communication is one factor among ten. It nudges the score. Doesn't dominate it.
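A sketch of that cross-check, under loud assumptions: the Study record and its two boolean fields are hypothetical stand-ins for whatever your communication log and report parser actually produce, and the 0.1 weight simply reflects "one factor among ten."

```python
from dataclasses import dataclass

@dataclass
class Study:
    # Hypothetical fields; real systems would derive these from the
    # communication log and the signed report.
    accession: str
    critical_call_logged: bool
    report_has_critical_finding: bool

def unsupported_calls(studies):
    """List accessions where a critical call was logged without a matching finding."""
    return [s.accession for s in studies
            if s.critical_call_logged and not s.report_has_critical_finding]

def communication_credit(study, weight=0.1):
    """Credit a call only when the report backs it up; the weight caps its influence."""
    supported = study.critical_call_logged and study.report_has_critical_finding
    return weight if supported else 0.0

studies = [
    Study("A1", True, True),    # call supported by the report
    Study("A2", True, False),   # call with no critical finding: sent to quality review
    Study("A3", False, False),  # no call, no credit
]
print(unsupported_calls(studies))
print([communication_credit(s) for s in studies])
```

The unsupported call earns nothing and lands on a review list. The supported one earns a small, capped credit.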
"I'll request more priors to inflate my comparison count." Prior count is pulled from PACS metadata automatically. The system knows how many priors exist for that patient, not how many you chose to open. Eight prior CTs? The system weights the case accordingly whether you reviewed them or not. (You should. That's the standard of care. The system just acknowledges it's more work.)
Here's the deeper point. Gaming wRVU requires one optimization: pick the studies with the best ratio of wRVU to effort. One lever. Easy.
Gaming a multi-factor system? You'd have to inflate reading time without triggering outlier detection, pad report complexity without triggering quality filters, fabricate communication events with an audit trail watching, and somehow manipulate patient demographics you don't control. Ten variables at once, all cross-referenced against each other and against population baselines. Good luck.
Can someone still find an edge? Probably. Someone always does. But the effort required to game a multi-factor system exceeds the effort of just reading the study properly. That's the whole point. You want a system where the path of least resistance is doing good work.
And unlike wRVU, this system learns. Recalibrate the weights every quarter against actual data. Outlier patterns that work in Q1 get caught by Q3. wRVU can't self-correct because it doesn't have enough information to know what's going wrong. A system watching 10 variables can.
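What might quarterly recalibration look like? One possible sketch, assuming "actual data" means observed effort in minutes per study and that each study is described by a vector of factor values. The function name, the learning rate, and the use of plain gradient descent are my assumptions. A real implementation would more likely use a standard regression library and far more than two factors.

```python
def recalibrate(weights, cases, minutes, lr=0.05, steps=500):
    """Refit factor weights by gradient descent on mean squared error.

    weights: last quarter's weights, used as the starting point.
    cases:   one vector of factor values per study.
    minutes: observed effort for each study.
    lr must be small relative to the scale of the factor values,
    or the updates will diverge.
    """
    w = list(weights)
    n = len(cases)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(cases, minutes):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

# Toy data with two hypothetical factors; minutes follow 2.0*f1 + 0.5*f2.
cases = [(1, 2), (2, 1), (3, 4), (4, 3)]
minutes = [3.0, 4.5, 8.0, 9.5]
new_weights = recalibrate([1.0, 1.0], cases, minutes)
print([round(w, 3) for w in new_weights])
```

Last quarter's weights treated both factors equally. The refit moves them toward what the data says each factor actually costs in time.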
Is it perfect? No. But the bar isn't perfection. The bar is better than what we have. And what we have is a system so easy to game that we don't even call it gaming. We call it "being productive."
You couldn't build this in 1992. You can build it now. The PACS data exists. The NLP exists. The AI to weight and calibrate exists. The computing power exists.
What doesn't exist is the will. Because the people who benefit from the current system are the same people who would have to change it. CMS has no incentive to invest in a better system. The RUC has no incentive to make itself obsolete. PE groups have no incentive to build a metric that might pay their radiologists more fairly at the expense of their margins.
So who builds it?
Radiologists. Independent groups. People who actually sit in front of a PACS and know that an 80-year-old's postop MRI and a 30-year-old's clean lumbar MRI are not the same job.
Section 11 The Ask
I'm not pretending this is a finished proposal. It's a first draft. There are details to work out (how do you handle edge cases? how do you weight the factors initially? what's the right recalibration cadence?). But the direction is clear.
The current system measures one thing: what CPT code was billed. That's it. One variable. In 2026. When we have AI that can read the studies themselves, we're still paying radiologists based on a single code assigned in the 1990s by a committee that meets three times a year.
There are at least 30 factors that affect how hard a study is to read. wRVU captures approximately one of them, poorly.
The RUC won't fix this. It's the RUC's job to maintain the current system, not replace it.
PE won't fix this. PE profits from the information asymmetry between what a study is worth and what a radiologist gets paid.
Unless you're an heir or heiress, your time is the most valuable thing you have. You might as well get paid fairly for how you spend it.
The only people who will fix this are radiologists who are tired of being paid by a metric that can't tell the difference between a 3-minute read and a 22-minute read.
The technology exists. The data exists. The math isn't even that hard.
It is time we built it.

Written by
Avery J. Knapp Jr., M.D.
Board Certified Radiologist, Neuroradiology

Reviewed by
Chad Barker, M.D.
Musculoskeletal Specialist