SOTAVerified

Ethics

Papers

Showing 26-50 of 832 papers

| Title | Status | Hype |
| --- | --- | --- |
| XTRUST: On the Multilingual Trustworthiness of Large Language Models | Code | 1 |
| Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes | Code | 1 |
| Automated Kantian Ethics: A Faithful Implementation | Code | 1 |
| Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey | Code | 1 |
| Can Machines Learn Morality? The Delphi Experiment | Code | 1 |
| Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics | Code | 1 |
| PASS: An ImageNet replacement for self-supervised pretraining without humans | Code | 1 |
| NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism | Code | 1 |
| Large Language Models to Identify Social Determinants of Health in Electronic Health Records | Code | 1 |
| Deontological Ethics By Monotonicity Shape Constraints | Code | 1 |
| MoralBench: Moral Evaluation of LLMs | Code | 1 |
| A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics | Code | 1 |
| Artificial Intelligence Ethics and Safety: practical tools for creating "good" models | Code | 1 |
| AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N | Code | 1 |
| CATS: Conditional Adversarial Trajectory Synthesis for Privacy-Preserving Trajectory Data Publication Using Deep Learning Approaches | Code | 1 |
| Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models | Code | 1 |
| Language Model Alignment in Multilingual Trolley Problems | Code | 1 |
| VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations | Code | 1 |
| A Framework for Understanding and Visualizing Strategies of RL Agents | Code | 0 |
| Exploring and steering the moral compass of Large Language Models | Code | 0 |
| Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models | Code | 0 |
| Cross-model Fairness: Empirical Study of Fairness and Ethics Under Model Multiplicity | Code | 0 |
| Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions? | Code | 0 |
| Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models | Code | 0 |
| HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | Code | 0 |

Benchmark Results

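| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | RuGPT-3 Large | Accuracy | 68.6 | | Unverified |
| 2 | RuGPT-3 Medium | Accuracy | 68.3 | | Unverified |
| 3 | RuGPT-3 Small | Accuracy | 55.5 | | Unverified |
| 4 | Human benchmark | Accuracy | 52.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Human benchmark | Accuracy | 67.6 | | Unverified |
| 2 | RuGPT-3 Small | Accuracy | 60.9 | | Unverified |
| 3 | RuGPT-3 Large | Accuracy | 44.9 | | Unverified |
| 4 | RuGPT-3 Medium | Accuracy | 44.1 | | Unverified |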