SOTAVerified

Ethics

Papers

Showing 26-50 of 832 papers

| Title | Status | Hype |
|---|---|---|
| Inter(sectional) Alia(s): Ambiguity in Voice Agent Identity via Intersectional Japanese Self-Referents | | 0 |
| More-than-Human Storytelling: Designing Longitudinal Narrative Engagements with Generative AI | | 0 |
| Exploring Moral Exercises for Human Oversight of AI systems: Insights from Three Pilot Studies | | 0 |
| Kaleidoscope Gallery: Exploring Ethics and Generative AI Through Art | | 0 |
| HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | Code | 0 |
| Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach | | 0 |
| Clicking some of the silly options: Exploring Player Motivation in Static and Dynamic Educational Interactive Narratives | | 0 |
| Benchmarking Ethical and Safety Risks of Healthcare LLMs in China-Toward Systemic Governance under Healthy China 2030 | | 0 |
| Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt | | 0 |
| AI-powered virtual eye: perspective, challenges and opportunities | | 0 |
| Uncertain Machine Ethics Planning | | 0 |
| The Cognitive Foundations of Economic Exchange: A Modular Framework Grounded in Behavioral Evidence | | 0 |
| The GenAI Generation: Student Views of Awareness, Preparedness, and Concern | | 0 |
| Securing the Future of IVR: AI-Driven Innovation with Agile Security, Data Regulation, and Ethical AI Integration | | 0 |
| Federated learning, ethics, and the double black box problem in medical AI | | 0 |
| Generative AI in Education: Student Skills and Lecturer Roles | | 0 |
| The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach | | 0 |
| AI Ethics and Social Norms: Exploring ChatGPT's Capabilities From What to How | | 0 |
| Approaches to Responsible Governance of GenAI in Organizations | | 0 |
| Evaluation Framework for AI Systems in "the Wild" | | 0 |
| Achieving Distributive Justice in Federated Learning via Uncertainty Quantification | Code | 0 |
| Giving AI a voice: how does AI think it should be treated? | | 0 |
| Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions | | 0 |
| FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models | | 0 |
| Framework, Standards, Applications and Best practices of Responsible AI : A Comprehensive Survey | | 0 |

Benchmark Results

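| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | RuGPT-3 Large | Accuracy | 68.6 | | Unverified |
| 2 | RuGPT-3 Medium | Accuracy | 68.3 | | Unverified |
| 3 | RuGPT-3 Small | Accuracy | 55.5 | | Unverified |
| 4 | Human benchmark | Accuracy | 52.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Human benchmark | Accuracy | 67.6 | | Unverified |
| 2 | RuGPT-3 Small | Accuracy | 60.9 | | Unverified |
| 3 | RuGPT-3 Large | Accuracy | 44.9 | | Unverified |
| 4 | RuGPT-3 Medium | Accuracy | 44.1 | | Unverified |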