| Morality is Non-Binary: Building a Pluralist Moral Sentence Embedding Space using Contrastive Learning | Jan 30, 2024 | Contrastive Learning, Ethics | Code Available | 0 |
| Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale | Jan 25, 2021 | Ethics | Code Available | 0 |
| Learning Human Action Recognition Representations Without Real Humans | Nov 10, 2023 | Action Recognition, Ethics | Code Available | 0 |
| ACL Ready: RAG Based Assistant for the ACL Checklist | Aug 7, 2024 | Ethics, Language Modeling | Code Available | 0 |
| More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness | Apr 29, 2024 | Ethics, Language Modelling | Code Available | 0 |
| Towards a multi-stakeholder value-based assessment framework for algorithmic systems | May 9, 2022 | Ethics | Code Available | 0 |
| How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities | Nov 15, 2023 | Ethics, Fairness | Code Available | 0 |
| MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation | Nov 2, 2022 | counterfactual, Ethics | Code Available | 0 |
| A Recommendation and Risk Classification System for Connecting Rough Sleepers to Essential Outreach Services | Jul 30, 2020 | Ethics, General Classification | Code Available | 0 |
| HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | May 16, 2025 | Benchmarking, Ethics | Code Available | 0 |
| Decorrelation using Optimal Transport | Jul 11, 2023 | Binary Classification, Ethics | Code Available | 0 |
| Surveying Professional Writers on AI: Limitations, Expectations, and Fears | Apr 7, 2025 | Ethics, Misinformation | Code Available | 0 |
| ApplE: An Applied Ethics Ontology with Event Context | Feb 7, 2025 | Ethics | Code Available | 0 |
| What are People Talking about in #BlackLivesMatter and #StopAsianHate? Exploring and Categorizing Twitter Topics Emerging in Online Social Movements through the Latent Dirichlet Allocation Model | May 29, 2022 | Ethics | Code Available | 0 |
| Analyzing the Safety of Japanese Large Language Models in Stereotype-Triggering Prompts | Mar 3, 2025 | Ethics | Code Available | 0 |
| Data Defenses Against Large Language Models | Oct 17, 2024 | Ethics | Code Available | 0 |
| A Low-Cost Ethics Shaping Approach for Designing Reinforcement Learning Agents | Dec 12, 2017 | Ethics, reinforcement-learning | Code Available | 0 |
| A Group-Specific Approach to NLP for Hate Speech Detection | Apr 21, 2023 | Common Sense Reasoning, Ethics | Code Available | 0 |
| Semantics derived automatically from language corpora contain human-like biases | Aug 25, 2016 | BIG-bench Machine Learning, Ethics | Code Available | 0 |
| When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas | May 25, 2025 | Ethics, Navigate | Code Available | 0 |
| An Integrative Survey on Mental Health Conversational Agents to Bridge Computer Science and Medical Perspectives | Oct 25, 2023 | Ethics, Experimental Design | Code Available | 0 |
| Thorny Roses: Investigating the Dual Use Dilemma in Natural Language Processing | Apr 17, 2023 | Ethics, Survey | Code Available | 0 |
| A History of Philosophy in Colombia through Topic Modelling | Dec 5, 2024 | Articles, Ethics | Code Available | 0 |
| Informed AI Regulation: Comparing the Ethical Frameworks of Leading LLM Chatbots Using an Ethics-Based Audit to Assess Moral Reasoning and Normative Values | Jan 9, 2024 | Decision Making, Ethics | Code Available | 0 |
| TAPE: Assessing Few-shot Russian Language Understanding | Oct 23, 2022 | Adversarial Attack, Adversarial Text | Code Available | 0 |