CSAT and QA Score Correlation Review

Compares internal QA evaluation scores against post-call CSAT ratings to validate that the scorecard accurately predicts real customer satisfaction outcomes. Identifies gaps between agent performance assessments and customer perception.

Review Context & Attribution

What is the review period covered by this correlation analysis?
Select the period this review covers (e.g., monthly, quarterly).
Which team, queue, or business unit is being reviewed?
Specify the team, skill group, or product line (e.g., 'Tier 1 Support – EMEA').
What is the total number of interactions included in this analysis?
Enter the number of QA-evaluated interactions that also have a matched CSAT response.
What was the CSAT response rate for this period?
Enter as a percentage (e.g., 18%). Low response rates (<10%) reduce correlation reliability.
Who conducted this correlation review?
Name and role of the QA analyst or program manager completing this review.

Aggregate Score Alignment

The average QA score for this period accurately reflects the level of service customers reported receiving.
Rate alignment: 1 = Strongly disagree (large gap between QA avg and CSAT avg) → 5 = Strongly agree (scores closely mirror each other).
The distribution of QA scores (high / mid / low) mirrors the distribution of CSAT ratings (promoter / passive / detractor) for the same interactions.
1 = Strongly disagree → 5 = Strongly agree.
If aggregate alignment is rated 3 or below, describe the nature of the gap (e.g., QA scores high but CSAT low, or vice versa).
Provide directional detail: which metric is inflated or deflated relative to the other, and by how much.
Interactions scored 90%+ on QA consistently receive a CSAT rating of 4 or 5 out of 5.
1 = Strongly disagree → 5 = Strongly agree. This tests whether top QA scores predict promoter-level satisfaction.
Interactions scored below 70% on QA consistently receive a CSAT rating of 1 or 2 out of 5.
1 = Strongly disagree → 5 = Strongly agree. This tests whether low QA scores predict detractor-level satisfaction.
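A reviewer who wants numbers behind the ratings in this section could compute them from the matched sample. The following is a minimal sketch, assuming each interaction is available as a (QA score in percent, CSAT rating 1-5) pair; the sample data and the function names `pearson` and `hit_rate` are hypothetical, and the thresholds are the ones used in the statements above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

def hit_rate(pairs, qa_test, csat_test):
    """Share of interactions passing qa_test whose CSAT also passes csat_test."""
    selected = [csat for qa, csat in pairs if qa_test(qa)]
    if not selected:
        return None  # no interactions in this QA band
    return sum(1 for csat in selected if csat_test(csat)) / len(selected)

# Hypothetical matched sample: (QA score in %, CSAT rating 1-5)
pairs = [(95, 5), (92, 4), (91, 2), (80, 4), (75, 3), (68, 2), (60, 1), (55, 4)]

r = pearson([qa for qa, _ in pairs], [csat for _, csat in pairs])
# Share of 90%+ QA interactions rated 4 or 5 by the customer
top_hit = hit_rate(pairs, lambda qa: qa >= 90, lambda c: c >= 4)
# Share of sub-70% QA interactions rated 1 or 2 by the customer
low_hit = hit_rate(pairs, lambda qa: qa < 70, lambda c: c <= 2)

print(f"r = {r:.2f}, top-band hit rate = {top_hit:.0%}, low-band hit rate = {low_hit:.0%}")
```

A coefficient near 1 and high hit rates in both bands would support a rating of 4 or 5 on the statements above; with small samples or low response rates, treat the figures as indicative only.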

Scorecard Criteria Validity

Which QA scorecard section or criterion shows the STRONGEST positive correlation with high CSAT scores?
Name the specific criterion (e.g., 'Empathy & Tone', 'First Contact Resolution', 'Accurate Information Provided').
Which QA scorecard section or criterion shows the WEAKEST or NEGATIVE correlation with CSAT scores?
Identify criteria where high QA scores do not translate to high customer satisfaction — these are calibration candidates.
The current scorecard weighting (point values per criterion) reflects what customers actually care about most.
1 = Strongly disagree → 5 = Strongly agree. Misaligned weighting is a leading cause of QA-CSAT divergence.
Describe any scorecard criteria you believe are over-weighted relative to their impact on customer satisfaction.
Include the criterion name, current weight, and your recommended adjustment with rationale.
Are there customer satisfaction drivers surfaced in CSAT verbatim comments that are NOT currently captured by the QA scorecard?
Select Yes or No. If Yes, detail the missing drivers in the follow-up field.
If yes, describe the unscored satisfaction drivers identified in CSAT verbatim feedback.
Examples: 'Customers frequently cite hold time as a dissatisfier — not currently scored' or 'Customers value proactive follow-up, which has no scorecard item'.
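To find the strongest and weakest criteria, one simple proxy is the difference in average CSAT between interactions where a criterion was met and those where it was missed. The sketch below assumes criteria are scored as met / not met; the record layout, the sample data, the 'Script Adherence' criterion, and the `csat_lift` helper are all hypothetical. It is not a correlation coefficient, only a quick way to rank criteria.

```python
from statistics import mean

# Hypothetical evaluations: each criterion is met (True) or missed (False),
# alongside the matched CSAT rating (1-5) for the same interaction.
evaluations = [
    {"criteria": {"Empathy & Tone": True,  "Script Adherence": True},  "csat": 5},
    {"criteria": {"Empathy & Tone": True,  "Script Adherence": False}, "csat": 4},
    {"criteria": {"Empathy & Tone": False, "Script Adherence": True},  "csat": 2},
    {"criteria": {"Empathy & Tone": False, "Script Adherence": False}, "csat": 1},
]

def csat_lift(evaluations, criterion):
    """Average CSAT when the criterion was met minus average CSAT when missed."""
    met = [e["csat"] for e in evaluations if e["criteria"][criterion]]
    missed = [e["csat"] for e in evaluations if not e["criteria"][criterion]]
    if not met or not missed:
        return None  # cannot compare without both groups
    return mean(met) - mean(missed)

criteria = ["Empathy & Tone", "Script Adherence"]
lifts = {name: csat_lift(evaluations, name) for name in criteria}

# Rank from strongest to weakest association with CSAT
ranked = sorted(lifts.items(), key=lambda item: item[1], reverse=True)
for name, lift in ranked:
    print(f"{name}: {lift:+.1f} CSAT points")
```

Criteria with a lift near zero or below are the calibration candidates; comparing each lift against the criterion's current weight gives a rough check on whether the weighting matches what customers respond to.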

Evaluator Calibration & Consistency

QA evaluators are applying scoring criteria consistently across agents and interaction types.
1 = Strongly disagree → 5 = Strongly agree. Inconsistent calibration inflates or deflates QA scores independently of actual performance.
Inter-rater reliability (IRR) sessions have been conducted this period to align evaluator standards.
Select Yes, No, or Partially. Lack of calibration sessions is a common root cause of QA-CSAT divergence.
Evaluator halo or recency bias appears to be inflating QA scores for certain agents or time periods.
1 = Strongly disagree (no bias detected) → 5 = Strongly agree (clear bias pattern observed).
If evaluator bias or inconsistency is suspected, describe the pattern and the agents or evaluators involved.
Be specific: e.g., 'Evaluator A scores Agent X 8-10 points higher than the team average on soft-skills criteria with no CSAT support'.
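A pattern like the one in the example can be screened for by comparing each evaluator's average score with the overall average. This is a rough sketch with hypothetical data and a hypothetical `evaluator_offsets` helper; a large offset may also reflect the mix of agents or interaction types an evaluator reviewed, so it is a prompt for a calibration check rather than proof of bias.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical QA scores: (evaluator, QA score in %)
scores = [
    ("Evaluator A", 92),
    ("Evaluator A", 90),
    ("Evaluator B", 80),
    ("Evaluator B", 82),
]

def evaluator_offsets(scores):
    """Each evaluator's mean score minus the mean of all scores."""
    overall = mean(score for _, score in scores)
    by_evaluator = defaultdict(list)
    for evaluator, score in scores:
        by_evaluator[evaluator].append(score)
    return {ev: mean(vals) - overall for ev, vals in by_evaluator.items()}

offsets = evaluator_offsets(scores)
for evaluator, offset in offsets.items():
    print(f"{evaluator}: {offset:+.1f} points vs. overall average")
```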

Detractor & Outlier Analysis

How many interactions received a high QA score (≥85%) but a low CSAT rating (1-2 out of 5) this period?
Enter the count. These 'false positive' interactions are the most valuable for scorecard recalibration.
How many interactions received a low QA score (<70%) but a high CSAT rating (4-5 out of 5) this period?
Enter the count. These 'false negative' interactions may indicate over-penalization of criteria customers don't value.
Root causes have been identified for the majority of high-QA / low-CSAT outlier interactions.
1 = Strongly disagree → 5 = Strongly agree.
Summarize the most common root causes identified in high-QA / low-CSAT outlier interactions.
Examples: 'Resolution was technically correct but agent tone was perceived as dismissive', 'Policy constraints frustrated customers despite agent compliance'.
Summarize the most common root causes identified in low-QA / high-CSAT outlier interactions.
Examples: 'Agent deviated from script but customer appreciated the personalized approach', 'Scored down for hold time but customer was satisfied with outcome'.
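The two outlier counts requested in this section can be produced directly from the matched sample. The sketch below applies the thresholds stated in the questions (QA ≥85% with CSAT 1-2, and QA <70% with CSAT 4-5); the sample data and the `classify` function name are hypothetical.

```python
from collections import Counter

def classify(qa, csat):
    """Label one interaction using the outlier thresholds from this section."""
    if qa >= 85 and csat <= 2:
        return "false_positive"   # high QA, low CSAT
    if qa < 70 and csat >= 4:
        return "false_negative"   # low QA, high CSAT
    return "other"

# Hypothetical matched sample: (QA score in %, CSAT rating 1-5)
pairs = [(95, 1), (88, 2), (85, 5), (69, 4), (60, 5), (72, 3), (65, 1)]

counts = Counter(classify(qa, csat) for qa, csat in pairs)
print(counts["false_positive"], counts["false_negative"], counts["other"])
```

Listing the interactions behind each count, rather than only the totals, makes the root-cause summaries above easier to complete.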

Action Planning & Program Recommendations

Based on this review, the QA scorecard requires updates to better predict customer satisfaction.
1 = Strongly disagree (scorecard is well-calibrated) → 5 = Strongly agree (significant revision needed).
What specific scorecard changes are recommended as a result of this correlation review?
Include: criteria to add, remove, or reweight; suggested new point values; and the CSAT evidence supporting each change.
What coaching or calibration actions will be taken with evaluators or agents based on this review?
Examples: 'Schedule IRR session for all evaluators in Q3', 'Coach agent cohort on empathy language linked to detractor verbatims'.
A follow-up correlation review is scheduled to measure the impact of changes made this period.
Select Yes or No. Recurring reviews are essential to validate that scorecard changes improve predictive accuracy.
Is there anything else about the QA-CSAT relationship this period that should be documented for program leadership?
Include any contextual factors (e.g., product outages, policy changes, seasonal volume spikes) that may have distorted the correlation this period.