Ethics

Papers

Showing 21–30 of 832 papers

Title	Date	Tasks	Status	Hype
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models	Jan 29, 2024	Ethics, Multiple-choice	Code Available	1
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics	Oct 9, 2023	Ethics, Fairness	Code Available	1
CATS: Conditional Adversarial Trajectory Synthesis for Privacy-Preserving Trajectory Data Publication Using Deep Learning Approaches	Sep 20, 2023	Ethics, Graph Matching	Code Available	1
Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics	Sep 13, 2023	Ethics, TruthfulQA	Code Available	1
TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection	Aug 21, 2023	Anomaly Detection, Attribute	Code Available	1
Large Language Models to Identify Social Determinants of Health in Electronic Health Records	Aug 11, 2023	Adversarial Robustness, Ethics	Code Available	1
Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models	Jun 5, 2023	Brain Tumor Segmentation, Ethics	Code Available	1
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark	Apr 6, 2023	Decision Making, Ethics	Code Available	1
Synthetically generated text for supervised text analysis	Mar 28, 2023	Articles, Ethics	Code Available	1
AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N	Aug 15, 2022	Ethics, Multi-agent Reinforcement Learning	Code Available	1

All datasets ETHICS Ethics (per ethics)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	RuGPT-3 Large	Accuracy	68.6	—	Unverified
2	RuGPT-3 Medium	Accuracy	68.3	—	Unverified
3	RuGPT-3 Small	Accuracy	55.5	—	Unverified
4	Human benchmark	Accuracy	52.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Human benchmark	Accuracy	67.6	—	Unverified
2	RuGPT-3 Small	Accuracy	60.9	—	Unverified
3	RuGPT-3 Large	Accuracy	44.9	—	Unverified
4	RuGPT-3 Medium	Accuracy	44.1	—	Unverified