Overview
We’ve collected feedback from evaluators through survey questions in our evaluation forms. With 111 completed evaluations from both the academic and applied streams, we can now share insights about time commitments, evaluator experience, and process feedback.
Time Commitment
Evaluators spend an average of 9.6 hours (median: 8 hours) on an evaluation, with individual times ranging from 3 to 32 hours. This spread reflects differences in paper complexity and evaluator approach.
*Figure: Distribution of evaluation time (hours).*
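For readers who want to reproduce this summary from the public dataset, the calculation is a straightforward aggregation. Below is a minimal sketch in Python/pandas; the file and column names (`evaluations.csv`, `hours_spent`) are hypothetical placeholders, as the actual names are documented in the GitHub repository linked at the end of this post.

```python
# Minimal sketch of the time-commitment summary; file and column names are hypothetical.
import pandas as pd

evals = pd.read_csv("evaluations.csv")   # hypothetical export of the public dataset
hours = evals["hours_spent"].dropna()    # hypothetical column holding hours per evaluation

print(f"mean:   {hours.mean():.1f} hours")                   # reported above as 9.6
print(f"median: {hours.median():.0f} hours")                 # reported above as 8
print(f"range:  {hours.min():.0f}-{hours.max():.0f} hours")  # reported above as 3-32
```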
Evaluator Experience
Our evaluators are highly experienced. Of the 82 respondents who reported their years in the field, the median is 10 years, and 43 (52%) report 10 or more years of experience.
For review experience, 79 respondents provided counts: the median is 20 papers reviewed (mean: 37.3), and 11 evaluators (14%) have reviewed 100 or more papers.
*Figures: Evaluator experience distribution; papers reviewed (log scale).*
A sample of verbatim responses:

| Years in Field | Papers Reviewed |
|---|---|
| Eighteen years working on digital advertising, thirty years evangelizing for the use of field experiments in economics. | Probably around 200 over 30 years |
| More than 10 years | Close to 100 |
| 15 | 25+ |
| 10 years | >100 |
| 10 years | 100+ |
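Because several of the experience answers are free text (as in the sample above), summarising them requires some light numeric parsing. The sketch below shows one possible approach, not necessarily the one used in our analysis code; the `years_in_field` and `papers_reviewed` column names are hypothetical.

```python
# Hedged sketch: extract the first number from free-text experience answers before summarising.
# File and column names are hypothetical.
import re

import pandas as pd

def first_number(text):
    """Return the first integer found in a free-text answer, or None if there is none."""
    if pd.isna(text):
        return None
    match = re.search(r"\d+", str(text))
    return int(match.group()) if match else None

evals = pd.read_csv("evaluations.csv")                        # hypothetical file name
years = evals["years_in_field"].map(first_number).dropna()    # hypothetical column name
papers = evals["papers_reviewed"].map(first_number).dropna()  # hypothetical column name

print("median years in field: ", years.median())
print("share with 10+ years:  ", (years >= 10).mean())
print("median papers reviewed:", papers.median())
print("share with 100+ papers:", (papers >= 100).mean())
```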
Process Feedback
38 evaluators provided feedback on our template and process.
Highlights of positive feedback:
- “two enthusiastic thumbs up”
- “9.5/10”
- “Process was clear, template was helpful”
- “Very thorough evaluation… enjoyed providing point estimates alongside text-based review”
Key areas for improvement:
- Questionnaire length: “Too many questions,” “annoying after spending time on referee report”
- Usability: “Sliders are confusing/wonky,” “metrics with intervals is confusing”
- Clarity: “Would be good to have worked examples,” “process not intuitive”
Willingness to Re-evaluate
When asked whether they would evaluate a revised version of a paper, 27 of 41 respondents (~66%) answered “yes”. This high engagement suggests evaluators find the process worthwhile despite the friction points.
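Since this question was also free text, the count above rests on simple string matching. A minimal sketch, again with a hypothetical column name (`would_reevaluate`):

```python
# Minimal sketch: count free-text answers that contain "yes" (case-insensitive).
# File and column names are hypothetical.
import pandas as pd

evals = pd.read_csv("evaluations.csv")
answers = evals["would_reevaluate"].dropna()         # hypothetical column name
yes = answers.str.contains("yes", case=False).sum()

print(f"{yes} of {len(answers)} responses (~{yes / len(answers):.0%}) include 'yes'")
```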
Fields Represented
The 20 evaluators who described their field of expertise come from diverse backgrounds, including development economics, experimental economics, environmental economics, political science, AI safety, meta-analysis, and more.
| Field of Expertise |
|---|
| Field experiments on the effects of advertising campaigns |
| I conduct impact evaluations related to the effectiveness of social safety net programs and have studied cash vs food preferences in similar contexts to those studied in this paper. |
| small scale irrigation (social science) |
| moral psychology, behavioural economics |
| Development Studies/ Agricultural Economist |
| Meta-analysis |
| Experimental economics |
| Political Science; Authoritarian Politics; MENA Politics; Experimental Methods |
| Development economics, with some work on irrigation. |
| Political economy, media and politics |
| experimental economics |
| Experimental and behavioral economics |
| Applied econometrics |
| welfare economics and normative ethics |
| Science-of-science |
| I work on integrated assessments and climate policy and cost-benefit analysis more in general. |
| Development economics |
| AI Safety |
| Land use and conservation |
| Environmental Economics |
What We’re Learning
- Time commitment is substantial but manageable: a median of 8 hours per high-quality peer review
- Evaluators are highly qualified: a median of 10 years in the field and 20 prior reviews, with 14% having reviewed 100+ papers
- The process needs refinement: evaluators appreciate the thoroughness but find the sliders and questionnaire length cumbersome
- Engagement is strong: the high willingness to re-evaluate indicates the process provides value despite the friction points
Next Steps
We’re working on:
- Streamlining the evaluation template
- Creating guides with worked examples
- Improving rating interface usability
- Reducing redundancy between written reviews and structured metrics
Conclusion
Analysis of 111 evaluations shows that The Unjournal successfully engages experienced researchers in substantive reviews averaging roughly 10 hours each. While there is room for improvement, particularly around interface usability, evaluators remain willing to participate, suggesting the process provides value that justifies the time investment.
Analysis code and data (excluding confidential information) are available in our GitHub repository.
Privacy: This dataset excludes confidential comments, COI information, and evaluator pseudonyms.